Yu Qian is one of UNSILO’s machine-learning experts. I spoke to her in July 2016.
How did you come to be in Denmark?
I’ve lived in Denmark for several years. I did my PhD here, after taking a BSc in bioinformatics. My first degree was in China, at the computing department of the University of Science and Technology of China (USTC), in Hefei.
What was your PhD about?
My PhD was a mixture of IT and statistics – it was about writing software for large-scale analysis, looking particularly at genomic variants. As part of the PhD, I did internships in Seattle and in Iceland. My internship in Iceland was with a Reykjavik company that specialized in analysing the human genome. In Seattle, I spent six months at the University of Washington looking at tools for detecting identity-by-descent, that is, matching DNA segments that indicate a common ancestor. During the PhD, I worked with many statisticians, and learned how to implement algorithms on a large scale.
Surely it is difficult to do text analytics in a language that is not your own?
I’m a machine-learning expert – I don’t use natural-language processing (NLP) tools. What I do is largely independent of language.
What do you think is the benefit of the kind of text analytics tools UNSILO develops?
One great advantage is speed. Using machine learning simply makes many processes faster. Think, for example, of Google’s automation features. When you book a flight, Google automatically adds the details to your calendar. As machine learning develops, we can do things faster still. The UNSILO concept extraction engine, for example, can identify concepts without having to build an ontology.
What software tools do you use for your research?
I barely use a library for my work; I use digital resources. I used to use ReadCube, but now I use Mendeley. I store articles in Mendeley; in this way I can read them on my mobile or my tablet.
The UNSILO concept extraction uses both machine learning and natural-language processing tools to do its work; which do you think is more important?
Machine learning, without a doubt! We need a bit of support from the NLP people, of course, but after all, NLP is just a subset of machine learning. NLP experts often specialise in one language, mostly English. However, with the advance of deep learning and language modelling, machines can learn a lot about data by themselves, and they can also learn a lot of languages. Moreover, as language evolves, machine learning is better at capturing and understanding new words and meanings than any human expert.
You speak both English and Danish – you are clearly good at learning languages! How easy did you find learning Danish?
I learned English at school, and I learned Danish while living here. I find Danish much easier than English, except that the pronunciation is very hard when there is an ‘r’ in the word!
What do you like most about living in Denmark?
Nice people, beautiful countryside, smart colleagues!