About Found In Translation

The purpose of this website is to demonstrate the state of the art in various disciplines of modern Artificial Intelligence.

In particular, we use state-of-the-art Machine Translation tools (the Moses [1] implementation of Phrase-Based Statistical Machine Translation [2]) and Text Categorization tools (the SVMlight [3] implementation of Support Vector Machines [4]). News items are gathered daily from the main pages of the top newspapers in each of the 27 countries of the European Union (560 outlets in 22 languages), machine-translated into English, and then automatically classified by topic. Finally, statistics about the popularity of various topics in each country are compiled and displayed graphically as color maps and histograms. The result is a global view of the contents of the European mediasphere that would be impossible to obtain without modern AI technology.

Importantly, both the Machine Translation and the Text Categorization methods used in this system are data-driven, that is, they are based on machine learning technology. The integration of Support Vector Machines, Statistical Machine Translation, Web Technologies and Computer Graphics delivers a complete system in which modern Statistical Machine Learning is used at multiple levels and is a crucial enabling part of the resulting functionality.
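As a rough illustration of the last stage of the pipeline (compiling topic popularity per country), here is a minimal Python sketch. It is not the code behind this website: the function name topic_shares, the (country, topics) input format and the sample data are assumptions made for the example, and the translation and classification stages are taken as already done.

```python
from collections import Counter, defaultdict


def topic_shares(articles):
    """Return {country: {topic: fraction of that country's articles tagged with it}}.

    `articles` is an iterable of (country, topics) pairs, where `topics` is the
    list of topic labels a classifier assigned to one article (hypothetical format).
    """
    totals = Counter()             # number of articles seen per country
    tagged = defaultdict(Counter)  # per country: number of articles per topic
    for country, topics in articles:
        totals[country] += 1
        for topic in set(topics):  # count each topic at most once per article
            tagged[country][topic] += 1
    return {
        country: {
            topic: count / totals[country]
            for topic, count in tagged[country].items()
        }
        for country in totals
    }


# Made-up sample input: two articles per country, already classified.
articles = [
    ("UK", ["sport"]),
    ("UK", ["politics", "business"]),
    ("FR", ["politics"]),
    ("FR", []),
]
shares = topic_shares(articles)
```

Fractions rather than raw counts are used here so that countries with different numbers of monitored outlets can be compared on the same color scale. That is a design choice of this sketch, not a documented property of the system.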

This is an OUTREACH project for the SMART and PASCAL consortia, and was created by the Pattern Analysis group of the University of Bristol.

It was presented in:
Marco Turchi, Ilias N. Flaounas, Omar Ali, Tijl De Bie, Tristan Snowsill, and Nello Cristianini, "Found in Translation", ECML/PKDD, Bled, Slovenia, Springer, pp. 746-749, 2009.

We wish to thank the colleagues who have made open source implementations of their tools available, particularly Philipp Koehn and Thorsten Joachims.


[1] Koehn, P., Hoang, H., Birch, A., Callison-Burch, C., Federico, M., Bertoldi, N., Cowan, B., Shen, W., Moran, C., Zens, R., Dyer, C., Bojar, O., Constantin, A. and Herbst, E. (2007) Moses: Open Source Toolkit for Statistical Machine Translation. Annual Meeting of the Association for Computational Linguistics, demonstration session. Prague, Czech Republic.
[2] Brown, P. F., Della Pietra, S., Della Pietra, V. J. and Mercer, R. L. (1993) The Mathematics of Statistical Machine Translation: Parameter Estimation. Computational Linguistics. 19: 263-311. Cambridge, MA, USA: MIT Press.
[3] Joachims, T. (1999) Making Large-Scale SVM Learning Practical. In Schölkopf, B., Burges, C. and Smola, A. (eds.), Advances in Kernel Methods - Support Vector Learning. Cambridge, MA, USA: MIT Press.
[4] Boser, B. E., Guyon, I., and Vapnik, V. (1992). A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory, pp. 144-152. ACM, Madison, WI.