Project blog
On this page, we keep you updated on important developments in the new/s/leak project.
News
New/s/leak 2.0: Second project phase finished
We are happy to announce that the second version of the tool new/s/leak has been completed. New/s/leak supports journalists in evaluating very large volumes of documents, such as those that repeatedly emerge from leaks of internal company or government data. The open-source software automatically identifies proper names and lets users visualize and filter them along the relationship networks found in the documents. In this way, one can quickly gain insight into otherwise unmanageable amounts of data and find starting points for journalistic reporting.
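To give a rough idea of what "filtering along relationship networks" can mean, here is a minimal sketch in Python. It is not the new/s/leak implementation: it assumes the proper names have already been extracted per document, and the sample data and the function names `cooccurrence_network` and `documents_mentioning` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: names already extracted from each document.
docs = {
    "doc1": ["Alice Meyer", "Acme Corp", "Berlin"],
    "doc2": ["Alice Meyer", "Acme Corp"],
    "doc3": ["Bob Schulz", "Berlin"],
}


def cooccurrence_network(entities_by_doc):
    """Count how many documents mention each pair of names together."""
    edges = Counter()
    for names in entities_by_doc.values():
        # Sort the unique names so each pair is stored in one fixed order.
        for a, b in combinations(sorted(set(names)), 2):
            edges[(a, b)] += 1
    return edges


def documents_mentioning(entities_by_doc, *required):
    """Return the ids of documents that mention all of the given names."""
    wanted = set(required)
    return sorted(
        doc_id
        for doc_id, names in entities_by_doc.items()
        if wanted <= set(names)
    )


network = cooccurrence_network(docs)
for (a, b), weight in network.most_common():
    print(f"{a} <-> {b}: {weight} shared document(s)")

print(documents_mentioning(docs, "Alice Meyer", "Acme Corp"))
```

Pairs that share many documents form the strongest edges in such a network, and selecting names narrows the collection down to the documents that mention all of them.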
News
New/s/leak at WissensWerte 2018
We present new/s/leak at a panel discussion at WissensWerte 2018, Germany’s most important dialogue forum for science journalists. On November 20, 2018, together with panelists from journalism, IT startups, and other universities, we will discuss how artificial intelligence contributes to journalistic work. In the case of new/s/leak, we employ machine learning to automatically extract relevant information such as named entities and keywords from texts. This enables us to create comprehensive interactive visualizations of large text collections, which support fast exploration for investigative purposes.
News
Paper accepted at EMNLP 2018 (Brussels, Belgium)
The paper “A Multilingual Information Extraction Pipeline for Investigative Journalism”, which focuses on the information extraction component of new/s/leak 2.0, has been accepted for the software demonstrations track of the 2018 Conference on Empirical Methods in Natural Language Processing. The conference paper is available in the ACL Anthology (here).
Abstract: We introduce an advanced information extraction pipeline to automatically process very large collections of unstructured textual data for the purpose of investigative journalism.
News
Paper accepted at SocInfo 2018 conference in St. Petersburg
Newsleak will be presented at the Social Informatics conference 2018, which takes place from 25 to 28 September in St. Petersburg, Russia. The conference paper is published in Springer's LNCS series (here). A preprint can be found here.
Abstract: Investigative journalism in recent years is confronted with two major challenges: 1) vast amounts of unstructured data originating from large text collections such as leaks or answers to Freedom of Information requests, and 2) multi-lingual data due to intensified global cooperation and communication in politics, business and civil society.
News
Newsleak 2.0 pre-release software demo
Since the first version of Newsleak, a lot has been improved behind the scenes as well as in the front-end of the software. We want to encourage journalists to try out a pre-release of Newsleak 2.0 on their own. For this, we provide a software demonstration. This demo is populated with approximately 26,500 documents collected from Wikipedia in four languages (English, German, Hungarian and Spanish) and mostly centered on the topic of World War II.
News
Presentation at #EIJC18 & Dataharvest conference
This Saturday, we present new/s/leak at the European Investigative Journalism Conference (EIJC). Here you can find the slides of our presentation about “Information Extraction and Visualisation for Investigative Journalism”.
If you are interested in trying new/s/leak with your own data, visit the GitHub page containing the Docker setup of our application.
In June, we will publish a detailed blog post on how to set up Hoover and Newsleak to analyze collections on your own machines.
News
Dataharvest Conference #EIJC18
From Thursday 24 to Sunday 27 May 2018, the EIJC 2018 conference (European Investigative Journalism Conference) will take place in Mechelen (Belgium). As the new/s/leak project, we will participate and discuss the requirements and needs of our target user group. You can find out more about the conference on this website: https://dataharvesteu.wordpress.com
News
Funding extension
We are happy to announce that the new/s/leak project has received additional funding from the Volkswagen Stiftung. Until summer 2018, new/s/leak will be extended and refactored to achieve the following goals:
* easy deployment for your own use
* comprehensive and detailed documentation
* improved user interface
* improved information extraction (better keyterm extraction, named entity recognition, support for user dictionaries)
* support for multiple languages (among others English, German, Spanish, French, Arabic, Chinese)
Follow the updates on this blog to see how far we got :)
News
new/s/leak demo @ SPIEGEL
Now that we’re in the middle of new/s/leak’s home stretch, we had a final demo at SPIEGEL in Hamburg. After some exciting and productive development sprints, we proudly introduced the software to journalists, documentarists and software developers, who gave us the best feedback by playing around with the tool and becoming absorbed in using it. Some evidence:
We also collected some more systematic feedback, which helped us prioritize the remaining tasks.
News
new/s/leak @ VIP
Last week, new/s/leak had its academic debut in the visualization science community at the Visualization in Practice Workshop, co-located with the IEEE VIS 2016 conference.
Here is the paper documenting the software with a focus on visualization. Needless to say, it’s always fun to present new/s/leak and get more feedback.
Thanks to everyone who came and visited us!