PhD candidate, University of New York Tirana
Sequential pattern extraction
The Big Data era is already underway, and we are called on to accommodate large-scale data and mine it for the insights it holds. Handling this high-volume, heterogeneous and dynamic data is becoming a must. Hence, alongside the problem of collecting and storing this sea of data, efficient learning and adaptation in the Big Data context becomes paramount. The amounts of data we store should be coupled with novel algorithms, customized applications and scalable tools. The aim of my research, centered on Machine Learning, is to improve and/or discover algorithms that help us extract valuable information from Big Data, with the promise of making our lives healthier, our tasks easier, and our businesses more agile and successful.
Abstract: A central issue in the context of smart cities is the capability to acquire timely information about city events. This paper describes a platform that focuses on processing messages posted on the Twitter social network. Key issues here are the high throughput (a large volume of data per second that needs to be processed) and the need to process ill-formed natural language texts. With these in mind, the platform has pipelined modules for robust, fast, real-time tweet acquisition and storage, filtering of several kinds, natural language processing and sentiment analysis, which feed a final analysis and visualization module. A case study of sentiment analysis during the 2014 FIFA World Cup in Brazil is used to validate the effort made so far.
Pub.: 16 Dec '16, Pinned: 19 Sep '17
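The pipelined structure described in the abstract above (filtering, then language processing, then sentiment analysis) can be illustrated with a minimal Python sketch built from chained generators. Everything here is an assumption made for illustration: the function names, the keyword filter, the tiny word lists and the scoring rule are hypothetical and are not taken from the paper's platform.

```python
import re
from typing import Iterable, Iterator, List, Set

# Tiny illustrative lexicon (an assumption, not the paper's resource).
POSITIVE = {"great", "love", "win", "goal"}
NEGATIVE = {"bad", "hate", "lose", "awful"}


def keep_relevant(tweets: Iterable[str], keywords: Set[str]) -> Iterator[str]:
    """Filtering stage: pass on only tweets that mention a keyword."""
    for text in tweets:
        lowered = text.lower()
        if any(k in lowered for k in keywords):
            yield text


def normalize(tweets: Iterable[str]) -> Iterator[List[str]]:
    """Language-processing stage: clean ill-formed text and tokenize it."""
    for text in tweets:
        # Drop URLs and @mentions after lowercasing.
        text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
        # Collapse runs of 3+ repeated characters to one ("goooal" -> "goal").
        text = re.sub(r"(.)\1{2,}", r"\1", text)
        yield re.findall(r"[a-z']+", text)


def score(token_lists: Iterable[List[str]]) -> Iterator[int]:
    """Sentiment stage: positive word count minus negative word count."""
    for tokens in token_lists:
        pos = sum(t in POSITIVE for t in tokens)
        neg = sum(t in NEGATIVE for t in tokens)
        yield pos - neg


def run_pipeline(tweets: Iterable[str], keywords: Set[str]) -> List[int]:
    """Chain the stages; each tweet flows through one at a time."""
    return list(score(normalize(keep_relevant(tweets, keywords))))
```

Because each stage is a generator, tweets are processed one by one rather than loaded in bulk, which loosely mirrors the streaming setting the abstract describes. A real system would replace the word lists with a trained sentiment model.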
Abstract: One weakness of machine-learned NLP models is that they typically perform poorly on out-of-domain data. In this work, we study the task of identifying products being bought and sold in online cybercrime forums, which exhibits particularly challenging cross-domain effects. We formulate a task that represents a hybrid of slot-filling information extraction and named entity recognition and annotate data from four different forums. Each of these forums constitutes its own "fine-grained domain" in that the forums cover different market sectors with different properties, even though all forums are in the broad domain of cybercrime. We characterize these domain differences in the context of a learning-based system: supervised models see decreased accuracy when applied to new forums, and standard techniques for semi-supervised learning and domain adaptation have limited effectiveness on this data, which suggests the need to improve these techniques. We release a dataset of 1,938 annotated posts from across the four forums.
Pub.: 31 Aug '17, Pinned: 19 Sep '17
Abstract: The increasing popularity of wearable devices in recent years means that a diverse range of physiological and functional data can now be captured continuously for applications in sports, wellbeing, and healthcare. This wealth of information requires efficient methods of classification and analysis where deep learning is a promising technique for large-scale data analytics. Whilst deep learning has been successful in implementations that utilize high performance computing platforms, its use on low-power wearable devices is limited by resource constraints. In this paper, we propose a deep learning methodology, which combines features learnt from inertial sensor data together with complementary information from a set of shallow features to enable accurate and real-time activity classification. The design of this combined method aims to overcome some of the limitations present in a typical deep learning framework where on-node computation is required. To optimize the proposed method for real-time on-node computation, spectral domain pre-processing is used before the data is passed onto the deep learning framework. The classification accuracy of our proposed deep learning approach is evaluated against state-of-the-art methods using both laboratory and real world activity datasets. Our results show the validity of the approach on different human activity datasets, outperforming other methods, including the two methods used within our combined pipeline. We also demonstrate that the computation times for the proposed method are consistent with the constraints of real-time on-node processing on smartphones and a wearable sensor platform.
Pub.: 28 Dec '16, Pinned: 19 Sep '17
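The combination described in the abstract above, spectral pre-processing feeding a learned model whose output is joined with a set of shallow features, can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the `learned` callable is a stand-in for a trained network, and the choice of shallow features (mean, standard deviation, minimum, maximum) is assumed rather than taken from the paper.

```python
import cmath
import math
import statistics
from typing import Callable, List, Sequence


def magnitude_spectrum(window: Sequence[float]) -> List[float]:
    """Naive O(n^2) DFT magnitudes for the non-negative frequency bins."""
    n = len(window)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(window)))
        for k in range(n // 2 + 1)
    ]


def shallow_features(window: Sequence[float]) -> List[float]:
    """Hand-crafted time-domain statistics (an assumed feature set)."""
    return [
        statistics.fmean(window),
        statistics.pstdev(window),
        min(window),
        max(window),
    ]


def combined_features(
    window: Sequence[float],
    learned: Callable[[List[float]], List[float]],
) -> List[float]:
    """Concatenate learned features of the spectrum with shallow features.

    `learned` is a placeholder for a trained model that maps the
    magnitude spectrum to a feature vector.
    """
    spectrum = magnitude_spectrum(window)
    return list(learned(spectrum)) + shallow_features(window)
```

For a window holding two full sine cycles, for example `[math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]`, the spectrum peaks at bin 2. On a device, an FFT routine would replace the naive transform used here for clarity.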
Source: Metabolism. Author(s): Pavel Hamet, Johanne Tremblay. Publication date: available online 11 January 2017.
Abstract: Artificial Intelligence (AI) is a general term that implies the use of a computer to model intelligent behaviour with minimal human intervention. AI is generally accepted as having started with the invention of robots. The term derives from the Czech word robota, meaning biosynthetic machines used as forced labour. In this field, Leonardo Da Vinci's lasting heritage is today's burgeoning use of robotic-assisted surgery, named after him, for complex urologic and gynecologic procedures. Da Vinci's sketchbooks of robots helped set the stage for this innovation. AI, described as the science and engineering of making intelligent machines, was officially born in 1956. The term is applicable to a broad range of items in medicine such as robotics, medical diagnosis, medical statistics, and human biology, up to and including today's “omics”. AI in medicine, which is the focus of this review, has two main branches: virtual and physical. The virtual branch includes informatics approaches from deep learning information management to control of health management systems, including electronic health records, and active guidance of physicians in their treatment decisions. The physical branch is best represented by robots used to assist the elderly patient or the attending surgeon. Also embodied in this branch are targeted nanorobots, a unique new drug delivery system. The societal and ethical complexities of these applications require further reflection, proof of their medical utility, economic value, and development of interdisciplinary strategies for their wider application.
Pub.: 11 Jan '17, Pinned: 19 Sep '17