


A popular approach to unveiling the black box of neural NLP models is to leverage saliency methods, which assign scalar importance scores to each input component. A common practice for evaluating whether an interpretability method is faithful has been to use evaluation-by-agreement: if multiple methods agree on an explanation, its credibility increases. However, recent work has found that saliency methods exhibit weak rank correlations even when applied to the same model instance and advocated for the use of alternative diagnostic methods. In our work, we demonstrate that rank correlation is not a good fit for evaluating agreement and argue that Pearson-$r$ is a better-suited alternative. We further show that regularization techniques that increase the faithfulness of attention explanations also increase agreement between saliency methods. By connecting our findings to instance categories based on training dynamics, we show that the agreement of saliency method explanations is very low for easy-to-learn instances. Finally, we connect the improvement in agreement across instance categories to local representation space statistics of instances, paving the way for work on analyzing which intrinsic model properties improve their predisposition to interpretability methods.
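
To make the metric comparison concrete, here is a minimal sketch (not from the paper; the scores are invented) of how agreement between two saliency methods' per-token importance vectors could be measured with rank correlation versus Pearson-r, using scipy:

    import numpy as np
    from scipy.stats import pearsonr, spearmanr

    # Hypothetical per-token importance scores for one input, produced by
    # two different saliency methods applied to the same model instance.
    saliency_a = np.array([0.02, 0.91, 0.05, 0.40, 0.01])
    saliency_b = np.array([0.04, 0.88, 0.03, 0.33, 0.02])

    rho, _ = spearmanr(saliency_a, saliency_b)  # rank-based agreement
    r, _ = pearsonr(saliency_a, saliency_b)     # Pearson-r agreement

    print(f"Spearman rho = {rho:.3f}, Pearson r = {r:.3f}")

One intuition consistent with the abstract's claim: rank correlation counts a swap among near-zero, unimportant tokens the same as a swap among the most salient ones, while Pearson-r takes score magnitudes into account.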
