Given a YouTube video URL, the program fetches the video's comments through YouTube's API, analyzes them with Llama 2, and classifies each one as Positive, Negative, or Neutral according to its content. Finally, it returns a dataset containing each comment's evaluation along with information about the video and the commenters. Since this is a containerized project, it can be executed anywhere using Docker.
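The two bookkeeping steps around the model call can be sketched as follows. This is a minimal illustration, not the project's code: the function names are hypothetical, the URL formats handled are assumed to be the common `watch?v=` and `youtu.be` forms, and the label parsing assumes Llama 2 replies in free text that mentions one of the three category names.

```python
from collections import Counter
from urllib.parse import parse_qs, urlparse

LABELS = ("Positive", "Negative", "Neutral")


def extract_video_id(url: str) -> str:
    """Return the video ID from a watch?v=... or youtu.be/... URL (assumed formats)."""
    parsed = urlparse(url)
    if parsed.hostname == "youtu.be":
        return parsed.path.lstrip("/")
    query = parse_qs(parsed.query)
    if "v" in query:
        return query["v"][0]
    raise ValueError(f"No video ID found in {url!r}")


def normalize_label(model_reply: str) -> str:
    """Map a free-text model reply to one of the three labels.

    Falls back to Neutral when no label name appears in the reply.
    """
    reply = model_reply.lower()
    for label in LABELS:
        if label.lower() in reply:
            return label
    return "Neutral"


def summarize(replies):
    """Count how many comments fall into each category."""
    return Counter(normalize_label(reply) for reply in replies)
```

Matching on the first label name found is deliberately naive; a reply that mentions several categories would need a stricter prompt or parser.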
This project shows an example of a working ETL task running on AWS with a containerized approach. At a high level, the app extracts raw information, transforms it by performing aggregations, and stores the result in Apache Parquet, an optimized columnar file format.
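The transform-and-store steps might look roughly like the sketch below. The record fields (`category`, `amount`) and both function names are hypothetical, since the actual data schema is not described here. The aggregation uses only the standard library; the load step assumes pandas is installed together with a Parquet engine such as pyarrow.

```python
from collections import defaultdict


def transform(records):
    """Aggregate raw records per category: row count, total, and mean amount."""
    totals = defaultdict(lambda: {"count": 0, "total": 0.0})
    for record in records:
        bucket = totals[record["category"]]
        bucket["count"] += 1
        bucket["total"] += record["amount"]
    return [
        {
            "category": key,
            "count": agg["count"],
            "total": agg["total"],
            "mean": agg["total"] / agg["count"],
        }
        for key, agg in sorted(totals.items())
    ]


def load(rows, path):
    """Write the aggregated rows to a Parquet file."""
    # Imported lazily so the transform step works without pandas installed.
    import pandas as pd

    pd.DataFrame(rows).to_parquet(path, index=False)
```

Keeping `transform` free of I/O makes it easy to test locally before the container is deployed.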
This is an end-to-end application for ingesting, processing, and monitoring crypto price data in real time.
This application implements several technologies:
- Asynchronous WebSocket connections to the data provider's API.
- Event streaming using Apache Kafka to transport data and connect all the systems.
- Data engineering with Pandas.
- Web dashboard design and creation with Dash for live monitoring.
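The producer/consumer shape of such a pipeline can be sketched with the standard library alone. In this illustration an `asyncio.Queue` stands in for the Kafka topic and a plain list of ticks stands in for the WebSocket feed; the function names and the tick fields (`symbol`, `price`) are assumptions, not the project's actual schema.

```python
import asyncio


async def produce(queue, ticks):
    """Push each incoming tick onto the queue, then signal the end of the stream."""
    for tick in ticks:
        await queue.put(tick)
        await asyncio.sleep(0)  # yield control, as waiting on a socket would
    await queue.put(None)  # sentinel: no more data


async def consume(queue):
    """Read ticks until the sentinel arrives, keeping the latest price per symbol."""
    latest = {}
    while True:
        tick = await queue.get()
        if tick is None:
            break
        latest[tick["symbol"]] = tick["price"]
    return latest


async def run_pipeline(ticks):
    """Run producer and consumer concurrently and return the consumer's result."""
    queue = asyncio.Queue()
    _, latest = await asyncio.gather(produce(queue, ticks), consume(queue))
    return latest
```

Calling `asyncio.run(run_pipeline(ticks))` returns a dictionary of the most recent price for each symbol, which is the kind of state a live dashboard would poll.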
This project developed an algorithm to predict anomalies in ATMs, given dates, holidays and geo-location.
It preprocesses the data with mean/standard deviation normalization and Principal Component Analysis (PCA), then applies k-means clustering, Support Vector Machines for anomaly detection, and time series analysis with the Facebook Prophet library.
Finally, it returns a ranking score showing the likelihood of an anomaly at a given location on a given date or holiday, so that resources can be focused efficiently on mitigating the issue.
This was developed using Python, TensorFlow, scikit-learn and Pandas.
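The mean/standard deviation normalization step can be illustrated with the standard library. The ranking function below is only a stand-in that orders locations by how far they deviate from the mean; it is not the SVM- and Prophet-based score described above, and the names and input shape are hypothetical.

```python
import statistics


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("Cannot standardize a constant series")
    return [(value - mean) / std for value in values]


def rank_by_deviation(readings):
    """Order location names by the absolute z-score of their reading, largest first.

    `readings` maps a location name to a single numeric measurement.
    """
    names = list(readings)
    scores = standardize([readings[name] for name in names])
    ranked = sorted(zip(names, scores), key=lambda pair: abs(pair[1]), reverse=True)
    return [name for name, _ in ranked]
```

For the series 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the population standard deviation is 2, so the value 9 maps to a z-score of 2 and ranks first.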
This project is a Convolutional Neural Network (CNN) for image classification. The dataset consists of 4,000 images of 4 dogs (1,000 pictures each), and the model predicts a dog's name from its image.
The architecture uses convolutional, pooling, and dense layers, training 300k+ parameters with the Adam optimizer.
This CNN reaches more than 98% accuracy and is presented through an interface.
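To show how convolutional, pooling, and dense layers contribute to a parameter count of this size, here is a small calculator. The layer sizes in the usage note are purely illustrative and are not the project's architecture; the sketch assumes square kernels, stride 1, no padding, and non-overlapping pooling.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square convolution kernel."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit for a fully connected layer."""
    return (inputs + 1) * units


def count_parameters(height, width, channels, layers):
    """Sum trainable parameters over a list of layer tuples.

    Supported tuples: ("conv", kernel, filters), ("pool", size), ("dense", units).
    Pooling layers change the feature-map size but add no parameters.
    """
    total = 0
    flat = None  # set once the feature maps are flattened for the dense layers
    for layer in layers:
        kind = layer[0]
        if kind == "conv":
            _, kernel, filters = layer
            total += conv2d_params(kernel, channels, filters)
            height, width = height - kernel + 1, width - kernel + 1
            channels = filters
        elif kind == "pool":
            _, size = layer
            height, width = height // size, width // size
        elif kind == "dense":
            _, units = layer
            if flat is None:
                flat = height * width * channels
            total += dense_params(flat, units)
            flat = units
        else:
            raise ValueError(f"Unknown layer type: {kind!r}")
    return total
```

For a hypothetical 64x64 RGB input with layers `[("conv", 3, 16), ("pool", 2), ("conv", 3, 32), ("pool", 2), ("dense", 64), ("dense", 4)]`, the function returns 406820, almost all of it from the first dense layer.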