File size: 2,159 Bytes
e25ec19
 
 
 
 
 
 
 
 
 
 
81e86e3
30f9723
 
ec1a43e
e25ec19
 
 
 
 
 
 
 
 
 
 
34da9a3
e25ec19
34da9a3
 
 
 
 
30f9723
34da9a3
30f9723
 
34da9a3
 
 
 
 
 
 
 
e25ec19
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
---
language: 
- en
thumbnail: https://cogcomp.seas.upenn.edu/images/logo.png
tags:
- text-classification
- bart
- xsum
license: cc-by-sa-4.0
datasets:
- xsum
widget:
- text: "<s> Ban Ki-moon was elected for a second term in 2007. </s></s> Ban Ki-Moon was re-elected for a second term by the UN General Assembly, unopposed and unanimously, on 21 June 2011."
- text: "<s> Ban Ki-moon was elected for a second term in 2011. </s></s> Ban Ki-Moon was re-elected for a second term by the UN General Assembly, unopposed and unanimously, on 21 June 2011."

---

# bart-faithful-summary-detector

## Model description

A BART (base) model trained to classify whether a summary is *faithful* to the original article. See our [paper in NAACL'21](https://www.seas.upenn.edu/~sihaoc/static/pdf/CZSR21.pdf) for details. 

## Usage
Concatenate a summary and a source document as input (note that the summary needs to be the **first** sentence). 

Here's an example usage (with PyTorch)
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
  
tokenizer = AutoTokenizer.from_pretrained("CogComp/bart-faithful-summary-detector")
model = AutoModelForSequenceClassification.from_pretrained("CogComp/bart-faithful-summary-detector")

article = "Ban Ki-Moon was re-elected for a second term by the UN General Assembly, unopposed and unanimously, on 21 June 2011."

bad_summary = "Ban Ki-moon was elected for a second term in 2007."
good_summary = "Ban Ki-moon was elected for a second term in 2011."

bad_pair = tokenizer(text=bad_summary, text_pair=article, return_tensors='pt')
good_pair = tokenizer(text=good_summary, text_pair=article, return_tensors='pt')

bad_score = model(**bad_pair)
good_score = model(**good_pair)

print(good_score[0][:, 1] > bad_score[0][:, 1]) # True, label mapping: "0" -> "Hallucinated" "1" -> "Faithful"
```




### BibTeX entry and citation info

```bibtex
@inproceedings{CZSR21,
    author = {Sihao Chen and Fan Zhang and Kazoo Sone and Dan Roth},
    title = {{Improving Faithfulness in Abstractive Summarization with Contrast Candidate Generation and Selection}},
    booktitle = {NAACL},
    year = {2021}
}
```