Vast amounts of scholarly knowledge are buried in electronic theses and dissertations (ETDs). ETDs are valuable documents, developed at great cost, that remain largely unknown and unused. We aim for digital libraries to open up these long documents using computerized text mining and analytics. We add value to existing systems by providing chapter-level labels and summaries, which allow readers to easily find chapters of interest. We fine-tune language models such as BERT and SciBERT on ETDs to better capture the specialized vocabulary present in such documents.
Abu Bakar, Iqra Safder, Saeed‐Ul Hassan