Dr Yuval Pinter, Ben-Gurion University of the Negev, Israel
Challenging and Adapting NLP Models to Lexical Phenomena
12 October 2021
Over the last few years, deep neural models have taken over the field of natural language processing (NLP), brandishing great improvements on many of its sequence-level tasks. But the end-to-end nature of these models makes it hard to figure out whether the way they represent individual words aligns with how language builds itself from the bottom up, or how lexical changes in register and domain can affect the untested aspects of such representations.
In this talk, I will present NYTWIT, a dataset created to challenge large language models at the lexical level, tasking them with identification of processes leading to the formation of novel English words, as well as with segmentation and recovery of the class of novel blends. I will then present XRayEmb, a method which alleviates the hardships of processing these novelties by fitting a character-level encoder to the existing models’ subword tokenizers; and conclude with a discussion of the drawbacks of current tokenizers’ vocabulary creation schemes.
Yuval Pinter is a Senior Lecturer in the Department of Computer Science at Ben-Gurion University of the Negev, focusing on NLP. Yuval got his PhD at the Georgia Institute of Technology School of Interactive Computing as a Bloomberg Data Science PhD Fellow. Before that, he worked as a Research Engineer at Yahoo Labs and as a Computational Linguist at Ginger Software, and obtained an MA in Linguistics and a BSc in CS and Mathematics, both from Tel Aviv University. Yuval blogs (in Hebrew) about language matters on Dagesh Kal.
Aleks Sandor Milovanovic and Dora Murgu, Interprefy
The backstage of a hybrid event – a complex string puppet called RSIBOX
15 October 2021
Hybrid events have been at the core of Interprefy since its creation in 2014, when remote simultaneous interpreting (RSI) was accepted only as a sideline to in-person events, where complex language pairs or space restrictions could require expanding the pool of in-person interpreting teams to one that also included remote participation. The real breakthrough came in 2018, when Interprefy won its first UN tender and the International Seabed Authority signed on with Interprefy as the first UN agency to replace onsite interpreters for its major meetings with remote interpreters, for a whopping cost saving of almost a million dollars. From there it went from strength to strength, culminating at WHA73, which was watched by a total of 800 million people worldwide and was the first fully online World Health Assembly in the history of the World Health Organization.
At Interprefy we have developed our own plug-and-play equipment (RSIBOX), which can be used onsite as a seamless bridge between AV and remote setups. The RSIBOX originated from experimentation in hybrid environments and is a piece of hardware that has been used at most football championships, Euro 2020 being the most prominent example.
During this webinar Aleks and Dora will speak about what goes on backstage for a seamless hybrid event and discuss the technology behind our RSIBOX. This webinar is oriented at EM TTI students who have a particular interest in interpreting technology, AV systems and hardware.
Dora Murgu. Romanian born and Spanish bred, Dora started her career as a conference interpreter. She soon transitioned into the backstage of interpretation services after creating a pioneering training program for OPI which she later taught at universities across Spain for over six years. She has presented several papers at major industry conferences and published articles on interpreting quality management, interpreter training and OPI service provision in Spain. She has worked for major LSPs and RSI providers for the past 13 years and currently holds the position of Interpreter Engagement Manager at Interprefy, one of the leading RSI platforms on the market. When she’s not immersed in the world of interpreters she treads the waters of the Arabian Gulf with her SUP board in Dubai, where she lives with her family.
Aleks Sandor Milovanovic. Raised in South Africa, Hungarian citizen Aleks Sandor moved to Switzerland in 2014. As one of the most senior members of Interprefy (the 3rd to be precise), he built the original Operations Team, for which he was responsible during the first startup phase of the company. Shortly before COVID hit, he created the Special Operations Department to respond more efficiently to high demand from very sensitive clients such as the UN, IMF and UEFA. The innovation that stemmed from his leadership included the Interprefy Gateway solution, which was first used at the Google PES 2018 and notably in the UN Hybrid Rooms setup, which enabled the UN to resume its operations after nearly three months of meetings without interpretation. In his spare time, Aleks enjoys kayaking and cycling around Lake Zurich.
Neural MT can facilitate communication in a way that surpasses previous MT paradigms, but there are also consequences of its use. As with the development of any technology, MT is not ethically neutral, but rather reflects the values of those behind its development. This talk considers the ethical issues around MT, beginning with data gathering and reuse and looking at how MT fits with the values and codes of the translator. If machines and systems reflect value systems, can they be explicitly ‘good’ and remove bias from their output? What is the contribution of MT to discussions of sustainability and diversity? Rather than promoting an approach that involves following a set of instructions to implement a technology unthinkingly, this talk will highlight the importance of a conscious decision-making process when designing a data-driven MT workflow.
Joss Moorkens is an Associate Professor and Chair of postgraduate translation programmes at the School of Applied Language and Intercultural Studies at Dublin City University. He is also a Funded Investigator with the ADAPT Centre and a member of the Centre for Translation and Textual Studies. He has authored over 50 journal articles, book chapters, and conference papers on translation technology, user interaction with and evaluation of machine translation, translator precarity, and translation ethics. He is General Coeditor of the journal Translation Spaces with Prof. Dorothy Kenny, and coedited the book ‘Translation Quality Assessment: From Principles to Practice’, published in 2018 by Springer, and special issues of Machine Translation (2019) and Translation Spaces (2020). He leads the Technology working group (with Prof. Tomas Svoboda of Charles University) as a board member of the European Masters in Translation network and sits on the advisory board of the Journal of Specialised Translation.
A number of staff and students recently attended RANLP 2021, held online this year due to the ongoing pandemic. As you can see from the reports below, it was still a lively and engaging conference.
Another successful online conference I attended was RANLP. I enjoyed being able to attend many sessions and workshops from the comfort of my home office 😊 RANLP had very interesting keynote speeches, which were quite informative about the ongoing research trends of different NLP groups all over the world. My online presentation went very well; the only thing I missed was seeing the attendees' reactions while talking. I can concentrate on either my slides or the participants 😊 But I was happy that the research ideas were of interest to many. The RANLP workshops were also excellent. Researchers from top-notch universities gave very interesting presentations. I am looking forward to repeating this wonderful experience in the future.
Hadeel Saadany – PhD Student
I recently participated in RANLP 2021 (Recent Advances in Natural Language Processing). RANLP has established itself over the years as one of the most influential and competitive NLP conferences. This year, due to the COVID situation in many countries, the organisers decided to hold the conference virtually using Zoom.
RANLP 2021 had excellent keynote speeches from top NLP researchers around the world. The RANLP organisers made sure that there were at least three keynote speeches per day. Usually, the day started with a keynote speech, there was another after the lunch break, and the day concluded with a third. Day 1 of RANLP 2021 began with a keynote speech from Dr Jing Jiang of Singapore Management University. She talked about the latest research on question answering. In the afternoon, we had a keynote speech from Prof Josef van Genabith and Nico Herbig on translation technologies. They talked about implementing a multimodal user interface for post-editing. Day 2 started with a keynote speech from Prof Hwee Tou Ng, who talked about current and future research directions in grammatical error correction. After the lunch break, we had a keynote speech from Prof Constantin Orasan. He gave a very informative session on preserving sentiment in machine translation. Since he is the first supervisor of my PhD studies, he also talked about the research we did on translation quality estimation for my PhD, so this session was special for me. Later in the afternoon, we had a keynote from Dr He He of New York University on text generation. She talked about the latest developments in the area, including neural transformers. The final day started with a keynote speech from Prof Tim Baldwin about text summarisation and the evaluation of text summarisation methods. The second keynote of the day was a session by Prof Sebastian Riedel, who talked about learning from knowledge bases and reasoning in machine learning models. The final keynote was a session with Prof Alessandro Moschitti, who gave a very informative talk on recent developments in question answering.
Overall, all of the keynote speeches at RANLP were enlightening and provided useful insight into the state of the art in several NLP topics. It was great to listen to the pioneers of the field and hear their first-hand experiences.
At RANLP 2021, I was fortunate to chair two parallel sessions. The first was on the 1st of September and contained four exciting long papers on offensive language identification, including work on Spanish and Romanian. The second session I chaired was on the 2nd of September and contained four fascinating papers on translation technologies. RANLP was my first experience as a session chair, and it was a good opportunity for me. I thank the RANLP organisers for allowing me to be a session chair.
I presented two papers at the conference. I presented my first paper on the 1st of September. It covered the work we did on creating an offensive language identification corpus for a low-resource language, Marathi. I presented the second paper on the 3rd of September; it was on multilingual misinformation identification in COVID-19 tweets, which is timely research. I received very good feedback from the audience on both papers, with comments on what to improve. I hope to incorporate them in my future work, and I am grateful to the RANLP participants for their valuable ideas.
During the conference, I got to know several researchers working in the same field at universities worldwide, and the networking was very valuable. However, I did miss the physical presence and all the fun activities at RANLP, such as the cocktail receptions and the Gala dinner. I hope that we can hold the next RANLP conference in person in Varna, Bulgaria, and be present at the venue. Finally, I would like to thank the organisers of RANLP for holding the conference despite the difficult situation in the world.
Erasmus Mundus European Master’s in Technology for Translation and Interpreting (EM TTI)
Call for applications for start date September 2022
Scholarship application deadline: 15th January 2022
Self-funded application deadline: 1st July 2022
We invite applications for the new Erasmus Mundus European Master’s programme in Technology for Translation and Interpreting (EM TTI) with a start date of September 2022. The programme, run by University of Wolverhampton, University of Malaga, New Bulgarian University and Ghent University, offers students the opportunity to study at two international institutions and to undertake work placements with industry leaders around the world. A competitive Erasmus Mundus scholarship is offered to the highest-ranking applicants. Both European and non-European students can apply.
Dr Parthena Charalampidou, University of Thessaloniki
Storytelling and multimodal metaphors in technical and operative content of multilingual corporate websites.
1 October 2021
Technical Communication constitutes a prerequisite for a product’s safe and efficient usage, as well as an inextricable part of its dissemination processes and branding strategy. It has to be localized, i.e. culturally adapted to the countries in which a company’s products or services are marketed, supporting their respective languages, and optimized for multilingual SEO. Traditionally, Technical Communication was offered in printed form only and took place through written discourse usually accompanied by supporting images. However, with the advent of technology and the development of digital means of communication, Technical Communication has transformed into a multisemiotic and multimodal form of communication. Dynamic pictures and videos have replaced static technical content found in imagetexts. Moreover, interactive elements allow users to share their personal experiences with the product and even become producers of Technical Communication content themselves (Kimball, 2006).
In this context, technical content is no longer isolated from the company’s marketing strategy but is very often integrated into it through the hypermodal possibilities offered by the multimedial context in which it occurs. The brand’s storytelling can then take various forms and can become intertwined, through different traversals, with the product’s technical documentation. Thus, although technical content was formerly considered mainly informative, new realities reveal that technical content can be both operative and expressive, in line with the marketing story of the brand.
In this talk we will address this new form of multimodal technical content and the development of digital storytelling in localized and international corporate website versions. We will examine, comparatively and contrastively, the multisemiotic narratives that are being developed in different cultural contexts, in order to appeal to different audiences, either local or international ones. Particular attention will be given to multimodal rhetorical tropes such as multimodal metaphors and the way they contribute to a corporate website’s narrative. Multimodal metaphors’ culture-specificity is expected to unveil discrepancies in different language versions.
Parthena Charalampidou holds a BA in English Language and Literature, an MA in Language and Communication Sciences and a PhD in Translation and Website Localization from Aristotle University of Thessaloniki. Her research interests revolve around semiotic, rhetorical and cultural approaches to translation and she is particularly interested in the localization of promotional digital genres (transcreation) and in the application of technology and corpora to translation. Currently, she teaches Localization and Multimodal Translation at the Department of Translation, School of French Language and Literature, Aristotle University of Thessaloniki. She is also a member of the teaching staff of the Joint EMT Postgraduate Programme “Interpreting and Translation” and was a Visiting Scholar of the Erasmus Mundus Master Programme ‘Technology for Translation and Interpreting’ for the spring semester of 2020-2021. She has worked as a freelance translator and she is a member of scientific associations for translation and semiotics. She has participated in national and international conferences and her research has been published in various scientific journals, volumes and conference proceedings. She has recently translated Miguel Jimenez Crespo’s book “Translation and Web Localization” into Greek.