I'm a backend software developer working primarily with Python. I have experience with machine learning libraries such as scikit-learn, and especially with NLP libraries such as spaCy and AllenNLP.
AWS is my primary cloud provider, with a special focus on serverless architectures built on Lambda, API Gateway, DynamoDB, Aurora Serverless, S3, Kinesis, Athena, and other services.
Built a fully automated machine learning pipeline for news article monitoring, feature extraction, NLP parsing, text classification, and data enrichment through multiple external APIs (geolocation and corporate data, among others). Used AWS Lambda for compute and multiple datastores: Amazon S3 (data lake), MySQL (analytical querying), and DynamoDB (highly scalable primary database). Built an internal user interface for tagging data for the machine learning models, as well as for human analysis and vetting of model outputs. Built a second user interface for the company's customers to consume the data, with full-text search and faceting (using Amazon CloudSearch) and export to Microsoft Excel.
Created a fully automated pipeline that collects, extracts, and enriches data to build a database of corporate announcements from around the world. The project includes two frontend applications: one for internal analysts to curate the data and another for end users to query and interact with it.
Text announcements are collected continuously from thousands of sources. NLP is used for named-entity recognition and part-of-speech tagging, and machine learning models classify each part of the text into several categories.