A tool for mining and exploring arguments in US presidential election debates from 1960 to 2016.
Political debates are the means political candidates use to put forward and justify their positions on the issues at stake in front of the electorate. Argument mining is a novel research area in Artificial Intelligence that analyzes discourse at the pragmatic level and applies argumentation theory to model and automatically analyze textual data. DISPUTool is a tool designed to ease the work of historians and social science scholars in analyzing the argumentative content of political speeches. More precisely, DISPUTool allows users to explore and automatically identify argumentative components in the 39 political debates from the last 50 years of US presidential campaigns (1960-2016).
Researchers, students and engineers have contributed to developing an AI algorithm capable of locating, identifying and quantifying the species present in the footage.
The Mercantour National Park wishes to monitor and identify common and rare species present on its territory from videos extracted from camera traps located in the Roya Valley.
An algorithm based on convolutional neural networks (Faster R-CNN with an Inception-ResNet-v2 backbone) has been implemented, making the counting and estimation of the fauna present more efficient.
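The counting step can be sketched in a few lines: given per-frame detections (species label plus confidence score, as a detector such as Faster R-CNN would output), keep only the confident detections and aggregate them into per-species counts. The frame data below is purely illustrative, not actual camera-trap output.

```python
from collections import Counter

# Hypothetical per-frame detector output: (species_label, confidence_score).
# In the actual pipeline these would come from the CNN detector.
frames = [
    [("chamois", 0.92), ("chamois", 0.88), ("ibex", 0.40)],
    [("chamois", 0.95)],
    [("wolf", 0.81), ("chamois", 0.30)],
]

def count_species(frames, threshold=0.5):
    """Aggregate confident detections across frames into per-species counts."""
    counts = Counter()
    for detections in frames:
        for species, score in detections:
            if score >= threshold:
                counts[species] += 1
    return counts

print(count_species(frames))  # Counter({'chamois': 3, 'wolf': 1})
```

Low-confidence detections (here the 0.40 "ibex") are discarded, which keeps spurious detections from inflating the fauna estimates.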
The Métropole Nice Côte d’Azur (MNCA) launched an initiative in 2021 to study the environmental, social and economic factors influencing the health of its citizens. The 3IA Côte d’Azur took part in this initiative through a collaborative project gathering the MNCA, the University Hospital of Nice, and the economics laboratory of Université Côte d’Azur. The mid-term objective of this initiative is to provide geo-localized information to guide and evaluate public policies. A first use case of this project aimed at identifying the geographical disparities of breast cancer screening for women aged 50 to 74 in the department of Alpes-Maritimes. The analysis is made at the scale of the smallest available administrative territory (IRIS), as defined by the French national statistics institute (INSEE): it provides enough statistical information at the smallest area while avoiding work on individual data subject to the GDPR.
A first task of the 3IA Côte d’Azur was to perform data qualification of various open or restricted databases provided by French public institutions (INSEE, IGN, CPAM06, CAF06, MNCA, CHU Nice). The databases relate to 5 major themes (transport, demographics, environment, healthcare, and geography) and were audited to detect missing values, outliers and inconsistencies across databases. After data cleaning, the databases were visualized and analysed by the various partners of the project.
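The audit steps described above can be sketched with pandas on a toy table (the column names and values are hypothetical, not the actual CPAM06/INSEE data):

```python
import pandas as pd

# Toy stand-in for one of the audited databases (hypothetical columns).
df = pd.DataFrame({
    "iris_code": ["060880101", "060880102", None, "060880104"],
    "population": [1200, 1350, 1280, 99999],  # 99999 looks like a sentinel value
})

# 1. Count missing values per column.
missing = df.isna().sum()

# 2. Flag outliers with the interquartile-range rule.
q1, q3 = df["population"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["population"] < q1 - 1.5 * iqr) |
              (df["population"] > q3 + 1.5 * iqr)]

print(missing["iris_code"], list(outliers["population"]))
```

Rows flagged this way are then checked against the other databases before cleaning, rather than dropped blindly.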
Below is the map of the participation rate in breast cancer screening (either individual or organised screening) for women aged 50 to 74 from 17/06/2019 to 17/06/2021 (source: CPAM06).
The 3IA Côte d’Azur has also participated in the study of the difficulty of accessing screening centers by personal vehicle or public transportation at different locations in the department of Alpes-Maritimes. More specifically, various accessibility indices for reaching certified radiological centers for breast cancer screening were estimated, as shown in the maps below.
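While the exact indices used in the study are not detailed here, a simple gravity-type accessibility index can be sketched as follows: nearby, high-capacity centers contribute more to a zone's accessibility than distant or small ones. The travel times and capacities below are made-up values.

```python
import math

# Hypothetical travel times (minutes) from one IRIS zone to certified
# radiology centers, with center capacities (e.g. exams per week).
centers = {"center_A": {"time": 12, "capacity": 80},
           "center_B": {"time": 35, "capacity": 120}}

def gravity_accessibility(centers, beta=0.1):
    """Gravity-based index: capacity discounted by exponential travel-time decay."""
    return sum(c["capacity"] * math.exp(-beta * c["time"])
               for c in centers.values())

print(round(gravity_accessibility(centers), 1))
```

Computed per IRIS zone, such an index can then be mapped to reveal the geographical disparities the project studies.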
Person re-identification (ReID) illustrates a fundamental problem in Computer Vision: how to represent an object (i.e. its appearance) independently of how it is perceived (i.e. its pose, viewpoint, illumination and the given sensor). As a core component of intelligent video surveillance systems, person ReID has attracted increasing attention in the computer vision research community. Given a query person image, a person ReID system seeks the most similar images in the gallery by sorting representation similarity. Although rapid improvements have been witnessed in recent years, person ReID still suffers from real-world environmental factors, such as illumination, pose, viewpoint and background variance, which degrade the quality of image representations. How to build representations that are robust to these environmental factors with limited data remains a key problem for person ReID. In addition, annotating a large-scale person ReID dataset is a cumbersome task, which strongly limits the scalability of supervised ReID methods.
In our work, we first study how to build robust representations for supervised person ReID. Then, we remove the human supervision and design unsupervised algorithms to enhance the flexibility of person ReID.
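The retrieval step described above, sorting gallery images by representation similarity, can be sketched with cosine similarity. The embeddings below are toy 2-D vectors standing in for actual CNN features.

```python
import numpy as np

# Hypothetical appearance embeddings (toy 2-D vectors, not real CNN features).
query = np.array([0.6, 0.8])
gallery = np.array([[0.8, 0.6],    # image 0
                    [0.6, 0.8],    # image 1: same direction as the query
                    [1.0, 0.0]])   # image 2

def rank_gallery(query, gallery):
    """Sort gallery images by cosine similarity to the query (ReID retrieval)."""
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ q
    return np.argsort(-sims)  # most similar first

print(rank_gallery(query, gallery))  # image 1 ranks first
```

Supervised and unsupervised ReID models differ in how the embeddings are learned; the ranking step itself stays the same.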
Source code available here.
Hao Chen, Benoît Lagadec, François Brémond.
In the demo "Where is Charlie?" we show and compare the unsupervised ReID model vs. the supervised ReID model on a synthetic person dataset (the PersonX dataset).
In the demo "Where is my car?", we compare these models on a dataset of cars in real traffic (the VeRi dataset).
Where is Charlie?
Where is my car?
Standard machine learning approaches require a centralized dataset in order to train a model. In certain scenarios, such as the biomedical field, this is not straightforward for several reasons: privacy concerns, ethical committee approval, or the transfer of data to a centralized location. This slows down research in healthcare and limits the generalization of certain models.
Fed-BioMed is an open source project focused on empowering biomedical research using non-centralized approaches for statistical analysis.
Federated Learning (FL) is a machine learning procedure that trains a model without centralizing the data. FL aims to train higher-quality models by accessing more data than centralized approaches, while keeping the data securely decentralized. The main challenges are communication efficiency, data heterogeneity and security.
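One standard FL aggregation rule is Federated Averaging (FedAvg): each site trains locally, then the server averages the resulting parameters, weighted by local dataset sizes. A minimal sketch, with toy parameter vectors standing in for actual model weights:

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Federated Averaging: mean of client parameters weighted by the
    number of local samples (McMahan et al., 2017)."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Two hypothetical hospitals with different dataset sizes.
w_a = np.array([1.0, 2.0])   # parameters after local training at site A
w_b = np.array([3.0, 4.0])   # parameters after local training at site B
global_w = fedavg([w_a, w_b], client_sizes=[100, 300])
print(global_w)  # [2.5 3.5]
```

Only parameters travel over the network; the raw patient data never leaves each hospital, which is the property that motivates Fed-BioMed.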
We will show and compare the training of a DenseNet121 neural network model on the MedNIST dataset in a standalone setting vs. in Federated Learning.
Fed-BioMed is a project developed by INRIA, in the EPIONE team. Manager: Marco Lorenzi. Source code available here.
The first thing to do is to log in and access the "New Job" page. Several plugins are available for retrieving data from various sources.
We build graph data from co-authorship or co-citation links in scientific papers, and we use the abstracts as the textual content to analyze.
A good example for discovering the Linkage technology is running the algorithm over a set of emails. With our platform, you can import a .mbox file, which some email clients can export for you.
A quite useful plugin for social media analysis, which is a major Linkage use case. We provide a way of importing tweets from the past week (using the basic Twitter API) based on a keyword (or a mention, or a hashtag). We build graph data from the mentions (@) or retweets (RT) found in the retrieved tweets containing the keyword. Do not hesitate to increase the number of tweets to request (up to 10,000).
Our plugins are not tailored for every use case, so we provide a tool to import your own data. Your data must be formatted as a .csv file with a specific structure.
Here is an example:
node 1, node 2, Whole textual content (if several communications exist, concatenate them) sent from node 1 to node 2.
node A, node B, Whole textual content sent from node A to node B.
node 2, node 1, Whole textual content sent from node 2 to node 1.
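A file with this structure can be loaded into a directed, text-attributed graph. Here is a minimal sketch using only the Python standard library (the message texts are placeholders):

```python
import csv
import io

# Same three-column structure as the example above: sender, receiver, text.
raw = """node 1,node 2,Hello from 1 to 2
node A,node B,Message from A to B
node 2,node 1,Reply from 2 to 1"""

# Map each directed edge (sender, receiver) to its textual content.
edges = {}
for sender, receiver, text in csv.reader(io.StringIO(raw)):
    edges[(sender.strip(), receiver.strip())] = text.strip()

print(len(edges))  # 3 directed edges, each carrying its textual content
```

Note that (node 1, node 2) and (node 2, node 1) are distinct edges: direction matters, since Linkage analyzes who communicates what to whom.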
Nota Bene about Twitter: if you want to provide your own network based on Twitter data, you should know that we build graph data from Twitter following these rules:
- A "communication" from A to B exists when A mentions (@) B or when A retweets B.
- When it is a mention: the textual content associated with the communication is the whole tweet containing the mention.
- When it is an RT: the textual content associated with the communication is the content added with the RT (not the original tweet).
We also provide several demonstration datasets:
- Enron Scandal email dataset: all of the emails exchanged inside the Enron company around the period of the "Enron scandal", a financial fraud affair in the early 2000s that led the company into bankruptcy (this use case is described in this article).
- A small simulated dataset containing exactly 3 clusters and 4 topics (that is used to validate the algorithm).
- A dense network of 1,000 nodes containing 2 individuals communicating suspiciously (highlighting the ability of Linkage to detect weak signals).
Once you have imported data into the platform, you can re-run the algorithm on your existing jobs, with the same or with different parameters. As our method is fully unsupervised, it can be useful to iterate over some jobs.
Once you have successfully launched a job, you can follow its progression on the "Jobs" page.
Once the job is completed, click on the "View" button to access our visualization app. If you want to work with your own scripts, you can download the whole set of results as a .zip archive.
On the sidebar on the left of the screen, details of the different communities existing in the network are provided (Clusters), as well as details about the different communication topics (Topics). For each cluster, you can visualize every node sorted by importance ("show nodes"). For each topic, you can visualize every word sorted by relevance ("show words").
On the main panel, we provide a visualization of the clustering results as interactive graphs: the default display is a synthetic graph showing the different clusters and topics as a meta-network. Sometimes, especially when the number of clusters is high, edges representing weak connections can overload the visualization. It is then possible to filter them out by changing the cluster-to-cluster cutoff (in the Advanced panel of the sidebar).
Another interesting feature is expanding the meta-network to visualize the full network labelled with our clustering results ("Expand Clusters").
Nota Bene: if the network is really large, please consider using the WebGL mode for display ("Use WebGL" in the Advanced tab).
If the network is not too large, you can visualize an adjacency matrix that shows the top topics per edge of the network. Finally, the "Statistics" tab provides additional information such as the cluster and topic weights, as well as the top words for each topic.
Together with our platform, we have developed a set of REST APIs that allow you to interact with Linkage from R or Python:
- Click here and go to the "Developers" page for the Python documentation.
- Click here for the R package.
Video analytics enables us to objectively measure human behavior by recognizing everyday activities, emotions, eating habits and lifestyle. Human behavior can be modeled by learning from a large quantity of data from a variety of sensors, in order to improve and optimize, for instance, the quality of life of people suffering from behavioral disorders.
Embedding Artificial Intelligence onto low-power devices is a challenging task that has been partly overcome by recent advances in machine learning and hardware design. Presently, deep neural networks can be deployed on embedded targets to perform different tasks. However, there is still room to optimize deep neural networks for embedded devices. These optimizations mainly address power consumption, memory and real-time constraints, but also ease deployment at the edge. Moreover, there is still a need for a better understanding of what can be achieved for different use cases.
This project focuses on quantization and deployment of deep neural networks onto low-power 32-bit microcontrollers. A new framework for end-to-end deep neural networks training, quantization and deployment is proposed. This framework, called MicroAI, is designed as an alternative to existing inference engines. Our framework can indeed be easily adjusted and/or extended for specific use cases.
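As an illustration of the kind of quantization involved, here is a simplified sketch of symmetric 8-bit post-training quantization of a weight tensor (not MicroAI's actual implementation):

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric 8-bit quantization: real value ~= scale * int8 code.
    Storing int8 codes instead of float32 weights divides the memory
    footprint by 4, which matters on a 32-bit microcontroller."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

w = np.array([0.50, -1.27, 0.04], dtype=np.float32)
q, scale = quantize_int8(w)
print(q, q.astype(np.float32) * scale)  # int8 codes and dequantized values
```

On the target, inference then runs mostly in integer arithmetic, applying the scale only where float values are needed.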
We specifically demonstrate the efficiency of our framework in the case of Human Activity Recognition with connected glasses from the Ellcie-Healthy company. The prediction is made on-board from IMU data, preserving privacy and limiting energy consumption. A specific dataset, named UCA-EHAR, has been created in collaboration with Ellcie-Healthy. Thanks to MicroAI, the embedded neural network only needs 200 kB of flash and 130 kB of RAM, and the glasses can reach 24 hours of autonomy on their rechargeable battery.
Pierre-Emmanuel Novac (PhD student, UCA), Dr. Alain Pegatoquet (UCA), Pr. Benoît Miramond (UCA, 3IA), Dr. Christophe Caquineau (Ellcie-Healthy)
Developed by expert.ai and the University of Siena, WebCrow is the first AI-based software that tackles a language game using NLP technology and the richest self-updating repository of human knowledge available: the web. Designed to be multilingual, WebCrow is the world’s first Italian and English crossword puzzle solver.