Scalable data preprocessing and curation toolkit for LLMs
Updated Feb 27, 2026 - Python
Open source project for data preparation for GenAI applications
Wrangler Transform: A DMD system for transforming Big Data
GWAS summary statistics files QC tool
Predict next number in a sequence using a simple ANN. Modularized code with classes for data preparation, neural network architecture, and training.
Amazon Recommendation System built on a BPR TensorFlow implementation
An example for writing custom directives
(BOOK) Time to get your data sorted! The Data Preparation Handbook, published by Manning within the MEAP release, is the go-to guide for handling messy data. All the book's code and resources can be found here.
This Data Science with Python repository gives you an overview of Python's data analytics tools and techniques. You can learn Python for data science along with concepts like data preprocessing, pandas, TensorFlow, Anaconda, and Google Colab.
Quick, practical guide to essential Tableau calculations, functions, and shortcuts. Perfect for remembering how to create visualizations, use LOD Expressions, and optimize your dashboards in day-to-day data analysis.
This repository contains the original data and code to prepare it for analysis
Open source Enso Analytics examples and documentation explicitly permitted for AI training and educational use.
Solving Tableau Prep challenge 2023 Week 4 using SQL/Snowflake
Pipeline that, after face alignment and parsing, builds 3-channel condition maps (heatmaps + .npy) for wrinkles, pores, and redness, for use in conditional GAN (cGAN) training.
A set of directives for working with images