The Silent Cartographer

Given a YouTube video URL, the program fetches its comments through the YouTube Data API, then analyzes them with Llama 2 and classifies each one into one of three categories based on its content: Positive, Negative, or Neutral. Finally, it returns a dataset containing the comment classifications along with information about the video and the commenters. Since the project is containerized, it can be executed anywhere using Docker.
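As a rough illustration of the extraction step, the sketch below pulls top-level comments with the YouTube Data API v3 and leaves a stub where the Llama 2 classification call would go; the function names, page limit, and environment variable are assumptions for the example, not the project's actual code.

```python
# Minimal sketch: fetch comments via the YouTube Data API, then label them.
# The Llama 2 call is left as a stub since its interface is project-specific.
import os
import pandas as pd
from googleapiclient.discovery import build

def fetch_comments(video_id: str, api_key: str, max_pages: int = 5) -> list[dict]:
    """Pull top-level comments for a video via the YouTube Data API v3."""
    youtube = build("youtube", "v3", developerKey=api_key)
    comments, page_token = [], None
    for _ in range(max_pages):
        response = youtube.commentThreads().list(
            part="snippet", videoId=video_id, maxResults=100, pageToken=page_token
        ).execute()
        for item in response["items"]:
            snippet = item["snippet"]["topLevelComment"]["snippet"]
            comments.append({
                "author": snippet["authorDisplayName"],
                "text": snippet["textDisplay"],
                "published_at": snippet["publishedAt"],
            })
        page_token = response.get("nextPageToken")
        if not page_token:
            break
    return comments

def classify_sentiment(text: str) -> str:
    """Stub for the Llama 2 call; should return Positive, Negative, or Neutral."""
    # In the real project this prompts a Llama 2 model and parses the label
    # out of the completion; the exact inference setup is not shown here.
    raise NotImplementedError

if __name__ == "__main__":
    rows = fetch_comments("VIDEO_ID", os.environ["YT_API_KEY"])
    df = pd.DataFrame(rows)
    # df["sentiment"] = df["text"].map(classify_sentiment)
    print(df.head())
```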

check Github repo

AWS-powered ETL

This project shows an example of a working ETL task running on AWS with a containerized approach. At a high level, the app extracts raw information, transforms it by performing aggregations, and stores it in Apache Parquet, a columnar file format optimized for analytics.
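The sketch below illustrates the transform-and-load idea with pandas and Parquet; the column names, aggregation logic, and file paths are assumptions for the example rather than the project's real schema.

```python
# Minimal sketch of the transform-and-load step: aggregate raw records with
# pandas and write the result as Parquet. Columns and paths are illustrative.
import pandas as pd

def transform(raw: pd.DataFrame) -> pd.DataFrame:
    """Aggregate raw events into daily totals per category."""
    return (
        raw.assign(date=pd.to_datetime(raw["timestamp"]).dt.date)
           .groupby(["date", "category"], as_index=False)
           .agg(total_amount=("amount", "sum"), events=("amount", "count"))
    )

def load(df: pd.DataFrame, path: str) -> None:
    """Persist the aggregated table as Parquet (pyarrow engine)."""
    df.to_parquet(path, engine="pyarrow", index=False)

if __name__ == "__main__":
    raw = pd.read_csv("raw_events.csv")         # extract (local stand-in for the AWS source)
    load(transform(raw), "aggregated.parquet")  # transform + load
```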

check Github repo

Crypto Visualizer

This is an end-to-end application for ingesting, processing, and monitoring crypto price data in real time.
The application brings together several technologies:
- Asynchronous websocket connections to the data provider's API (see the ingestion sketch after this list).
- Event streaming with Apache Kafka to transport data and connect all the systems.
- Data engineering with Pandas.
- Web dashboard design and creation with Dash for live monitoring.
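The sketch below shows only the ingestion side under stated assumptions: a websocket client (via the websockets package) that forwards each price tick to a Kafka topic using kafka-python. The provider URL, topic name, and message format are placeholders, not the actual project configuration.

```python
# Minimal sketch: async websocket ingestion feeding a Kafka topic, from which
# downstream consumers (pandas processing, the Dash dashboard) would read.
import asyncio
import json

import websockets                   # pip install websockets
from kafka import KafkaProducer     # pip install kafka-python

WS_URL = "wss://example-provider/prices"  # hypothetical data-provider endpoint
TOPIC = "crypto-ticks"                    # hypothetical topic name

async def stream_ticks() -> None:
    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    async with websockets.connect(WS_URL) as ws:
        async for message in ws:
            tick = json.loads(message)
            # Publish each tick so the rest of the system can consume it.
            producer.send(TOPIC, tick)

if __name__ == "__main__":
    asyncio.run(stream_ticks())
```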

check Github repo

Anomaly detection in ATMs

This project developed an algorithm to predict anomalies in ATMs given dates, holidays, and geolocation.
It preprocesses the data with mean/standard-deviation normalization and Principal Component Analysis (PCA), applies k-means clustering and Support Vector Machines for anomaly detection, and performs time-series analysis with Facebook's Prophet library.
Finally, it returns a ranking score indicating the likelihood of an anomaly occurring for a given date or holiday and location, so resources can be focused efficiently on mitigating the issue.
It was developed using Python, TensorFlow, scikit-learn, and Pandas.
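A minimal scikit-learn sketch of that modeling chain is shown below; the feature columns, cluster count, and One-Class SVM parameters are illustrative assumptions, and the Prophet time-series step is omitted for brevity.

```python
# Minimal sketch: normalize, reduce with PCA, cluster with k-means, and fit a
# One-Class SVM whose decision function serves as an anomaly-likelihood score.
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.svm import OneClassSVM

def score_anomalies(features: pd.DataFrame) -> pd.DataFrame:
    # Mean/standard-deviation normalization followed by dimensionality reduction.
    scaled = StandardScaler().fit_transform(features)
    reduced = PCA(n_components=2).fit_transform(scaled)

    # Group observations (e.g. ATM/date combinations) by similar behavior.
    clusters = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(reduced)

    # One-Class SVM: decision_function returns a signed distance usable as a
    # ranking (lower values = more likely anomalous).
    svm = OneClassSVM(nu=0.05, kernel="rbf").fit(reduced)
    return features.assign(cluster=clusters, anomaly_score=svm.decision_function(reduced))
```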


DoggoNet

This project consists of a Convolutional Neural Network for image classification. The dataset consists of 4,000 images of 4 dogs (1,000 pictures each), and the network predicts the dog's name from its image.
The architecture combines convolutional, pooling, and dense layers, training 300k+ parameters with the Adam optimizer.
The CNN reaches over 98% accuracy and is exposed through a simple interface.
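A minimal Keras sketch of such a network is shown below; the input size, filter counts, and layer widths are assumptions for illustration, so the parameter count will not match the project's exact 300k+ figure.

```python
# Minimal sketch: a small CNN with convolutional, pooling, and dense layers,
# compiled with the Adam optimizer for a 4-class classification task.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_doggonet(input_shape=(128, 128, 3), num_classes=4) -> tf.keras.Model:
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

if __name__ == "__main__":
    build_doggonet().summary()
```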

check Github repo