Show simple item record

dc.contributor.author
Gu, Nianlong
dc.contributor.author
Gao, Yingqiang
dc.contributor.author
Hahnloser, Richard H.R.
dc.date.accessioned
2024-02-01T11:19:51Z
dc.date.available
2024-01-26T13:16:57Z
dc.date.available
2024-02-01T11:19:51Z
dc.date.issued
2023-10-10
dc.identifier.other
10.48550/ARXIV.2310.06436
en_US
dc.identifier.uri
http://hdl.handle.net/20.500.11850/655643
dc.description.abstract
We introduce MemSum-DQA, an efficient system for document question answering (DQA) that leverages MemSum, a long document extractive summarizer. By prefixing each text block in the parsed document with the provided question and question type, MemSum-DQA selectively extracts text blocks as answers from documents. On full-document answering tasks, this approach yields a 9% improvement in exact match accuracy over prior state-of-the-art baselines. Notably, MemSum-DQA excels in addressing questions related to child-relationship understanding, underscoring the potential of extractive summarization techniques for DQA tasks.
en_US
dc.language.iso
en
en_US
dc.publisher
Cornell University
en_US
dc.subject
Computation and Language (cs.CL)
en_US
dc.subject
FOS: Computer and information sciences
en_US
dc.subject
Document understanding
en_US
dc.subject
Question answering
en_US
dc.title
MemSum-DQA: Adapting An Efficient Long Document Extractive Summarizer for Document Question Answering
en_US
dc.type
Working Paper
ethz.journal.title
arXiv
ethz.pages.start
2310.06436
en_US
ethz.size
3 p.
en_US
ethz.identifier.arxiv
2310.06436
ethz.publication.place
Ithaca, NY
en_US
ethz.publication.status
published
en_US
ethz.leitzahl
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02140 - Dep. Inf.technologie und Elektrotechnik / Dep. of Inform.Technol. Electrical Eng.::02533 - Institut für Neuroinformatik / Institute of Neuroinformatics::03774 - Hahnloser, Richard H.R. / Hahnloser, Richard H.R.
en_US
ethz.leitzahl.certified
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02140 - Dep. Inf.technologie und Elektrotechnik / Dep. of Inform.Technol. Electrical Eng.::02533 - Institut für Neuroinformatik / Institute of Neuroinformatics::03774 - Hahnloser, Richard H.R. / Hahnloser, Richard H.R.
en_US
ethz.date.deposited
2024-01-26T13:16:57Z
ethz.source
FORM
ethz.eth
yes
en_US
ethz.availability
Metadata only
en_US
ethz.rosetta.installDate
2024-02-01T11:19:52Z
ethz.rosetta.lastUpdated
2024-02-01T11:19:52Z
ethz.rosetta.versionExported
true
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=MemSum-DQA:%20Adapting%20An%20Efficient%20Long%20Document%20Extractive%20Summarizer%20for%20Document%20Question%20Answering&rft.jtitle=arXiv&rft.date=2023-10-10&rft.spage=2310.06436&rft.au=Gu,%20Nianlong&Gao,%20Yingqiang&Hahnloser,%20Richard%20H.R.&rft.genre=preprint&rft_id=info:doi/10.48550/ARXIV.2310.06436&

Files in this item

There are no files associated with this item.
