Facebook launches Dynaboard, an evaluation-as-a-service platform for NLP


This content originally appeared on DEV Community and was authored by amananandrai

In natural language processing (NLP), it is difficult to gauge the performance of a model. Facebook has launched Dynaboard, which ranks state-of-the-art language models such as BERT, RoBERTa, ALBERT, T5, and DeBERTa on four common NLP tasks. The tasks are:

  • Natural Language Inference
  • Question Answering
  • Sentiment Analysis
  • Hate Speech Detection

To evaluate models on these tasks, a new performance metric called Dynascore was created. It takes several metrics into account:

  • Accuracy - the percentage of examples the model gets right
  • Compute - the number of examples the model can process per second on its instance in the Dynaboard evaluation cloud
  • Memory - the model's memory usage averaged over the time it is running, with measurements taken every N seconds
  • Robustness - how much the model's predictions change after perturbations are added to the examples
  • Fairness - how much the model's predictions change when the original datasets are perturbed by, for instance, changing noun phrase gender (e.g., replacing “sister” with “brother”, or “he” with “they”) or substituting names with others that are statistically predictive of another race or ethnicity. For the purposes of Dynaboard scoring, a model is considered more “fair” if its predictions don't change after such a perturbation
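
The perturbation idea behind the robustness and fairness metrics can be illustrated with a small sketch. This is not Dynaboard's implementation: the swap table, the `perturb` and `consistency` helpers, and the toy classifier are all hypothetical names made up for this example. It only shows the general pattern of perturbing each example and counting how often the prediction stays the same.

```python
import re
from typing import Callable, Dict, List

# Hypothetical swap table for illustration; not Dynaboard's actual perturbation set.
GENDER_SWAPS: Dict[str, str] = {
    "sister": "brother",
    "brother": "sister",
    "he": "they",
    "she": "they",
}


def perturb(text: str, swaps: Dict[str, str] = GENDER_SWAPS) -> str:
    """Replace whole words found in the swap table (case-insensitive match)."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, swaps)) + r")\b", re.IGNORECASE
    )
    # Replacements are emitted in lowercase, which is good enough for a sketch.
    return pattern.sub(lambda m: swaps[m.group(0).lower()], text)


def consistency(model: Callable[[str], str], examples: List[str]) -> float:
    """Percentage of examples whose prediction is unchanged after perturbation."""
    if not examples:
        raise ValueError("examples must be non-empty")
    unchanged = sum(model(x) == model(perturb(x)) for x in examples)
    return 100.0 * unchanged / len(examples)


# Toy stand-in classifier (an assumption): deliberately biased against "brother".
def toy_model(text: str) -> str:
    return "negative" if "brother" in text.lower() else "positive"


examples = ["My sister is kind.", "He wrote a great book.", "The food was fine."]
score = consistency(toy_model, examples)
print(f"prediction consistency: {score:.1f}%")
```

Here the toy model changes its answer for one of the three examples once “sister” becomes “brother”, so it scores lower than a model whose predictions ignore the swap.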

Dynascore is calculated by assigning different weights to these metrics and combining them depending on the type of task. Previously, the tasks mentioned above, which make up Dynabench, were evaluated statically. Dynaboard makes this process more dynamic.
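
As a rough illustration of weighting and combining metrics, here is a minimal sketch of a weighted average. It is not the Dynascore formula, which is described in the paper; the weights, the `weighted_score` function, and the sample numbers are hypothetical, and every metric is assumed to be already rescaled to a 0-100 range where higher is better.

```python
from typing import Dict

# Hypothetical weights; the article does not state the values Dynaboard uses.
WEIGHTS: Dict[str, float] = {
    "accuracy": 4.0,
    "compute": 1.0,
    "memory": 1.0,
    "robustness": 1.0,
    "fairness": 1.0,
}


def weighted_score(metrics: Dict[str, float],
                   weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted average of metrics already scaled to 0-100 (higher is better)."""
    missing = set(weights) - set(metrics)
    if missing:
        raise KeyError(f"missing metrics: {sorted(missing)}")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[name] * metrics[name] for name in weights) / total


# Made-up numbers for a single model, purely for illustration.
model_a = {
    "accuracy": 80.0,
    "compute": 50.0,
    "memory": 60.0,
    "robustness": 90.0,
    "fairness": 100.0,
}
score_a = weighted_score(model_a)
print(score_a)  # 77.5
```

Changing the weights changes the ranking: with equal weights the same model scores 76.0, because accuracy no longer dominates the average.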

The objectives achieved by Dynaboard are:

  • Reproducibility
  • Accessibility
  • Backwards Compatibility
  • Forward Compatibility
  • Prediction Costs

To learn more about Dynaboard, read the official Facebook blog post; for further implementation details, read the paper.


amananandrai | Sciencx (2021-05-24T19:16:40+00:00) Facebook launches Dynaboard an evaluation-as-a-service for NLP. Retrieved from https://www.scien.cx/2021/05/24/facebook-launches-dynaboard-an-evaluation-as-a-service-for-nlp/
