1. Personal Projects
1) From-scratch PyTorch Implementations of AI papers
| Year | Paper | Contents |
|---|---|---|
| Vision | | |
| 2014 | VAE (Kingma and Welling) | [] Training on MNIST [] Visualizing Encoder output [] Visualizing Decoder output [] Reconstructing image |
| 2015 | CAM (Zhou et al.) | [] Applying GoogLeNet [] Generating 'Class Activation Map' [] Generating bounding box |
| 2016 | Gatys et al. | [] Experimenting on input image size [] Experimenting on VGGNet-19 with Batch normalization [] Applying VGGNet-19 |
| | YOLO (Redmon et al.) | [] Model architecture [] Visualizing ground truth on grid [] Visualizing model output [] Visualizing class probability map [] Loss function [] Training on VOC 2012 |
| | DCGAN (Radford et al.) | [] Training on CelebA at 64 x 64 [] Sampling [] Interpolating in latent space [] Training on CelebA at 32 x 32 |
| | Noroozi et al. | [] Model architecture [] Chromatic aberration [] Permutation set |
| | Zhang et al. | [] Visualizing empirical probability distribution [] Model architecture [] Loss function [] Training |
| 2014, 2017 | Conditional GAN (Mirza et al.) + WGAN-GP (Gulrajani et al.) | [] Training on MNIST |
| 2016, 2017 | VQ-VAE (Oord et al.) + PixelCNN (Oord et al.) | [] Training on Fashion MNIST [] Training on CIFAR-10 [] Sampling |
| 2017 | Pix2Pix (Isola et al.) | [] Experimenting on image mean and std [] Experimenting on nn.InstanceNorm2d() [] Training on Google Maps [] Training on Facades [] Higher-resolution input image |
| | CycleGAN (Zhu et al.) | [] Experimenting on random image pairing [] Experimenting on LSGANs [] Training on monet2photo [] Training on vangogh2photo [] Training on cezanne2photo [] Training on ukiyoe2photo [] Training on horse2zebra [] Training on summer2winter_yosemite |
| 2018 | PGGAN (Karras et al.) | [] Experimenting on image mean and std [] Training on CelebA-HQ at 512 x 512 [] Sampling |
| | DeepLabv3 (Chen et al.) | [] Training on VOC 2012 [] Predicting on VOC 2012 validation set [] Average mIoU [] Visualizing model output |
| | RotNet (Gidaris et al.) | [] Visualizing Attention map |
| | StarGAN (Choi et al.) | [] Model architecture |
| 2020 | STEFANN (Roy et al.) | [] FANnet architecture [] Colornet architecture [] Training FANnet on Google Fonts [] Custom Google Fonts dataset [] Average SSIM [] Training Colornet |
| | DDPM (Ho et al.) | [] Training on CelebA at 32 x 32 [] Training on CelebA at 64 x 64 [] Visualizing denoising process [] Sampling using linear interpolation [] Sampling using coarse-to-fine interpolation |
| | DDIM (Song et al.) | [] Normal sampling [] Sampling using spherical linear interpolation [] Sampling using grid interpolation [] Truncated normal |
| | ViT (Dosovitskiy et al.) | [] Training on CIFAR-10 [] Training on CIFAR-100 [] Visualizing Attention map using Attention Roll-out [] Visualizing position embedding similarity [] Interpolating position embedding [] CutOut [] CutMix [] Hide-and-Seek |
| | SimCLR (Chen et al.) | [] Normalized temperature-scaled cross entropy loss [] Data augmentation [] Pixel intensity histogram |
| | DETR (Carion et al.) | [] Model architecture [] Bipartite matching & loss [] Batch normalization freezing [] Training on COCO 2017 |
| 2021 | Improved DDPM (Nichol and Dhariwal) | [] Cosine diffusion schedule |
| | Classifier-Guidance (Dhariwal and Nichol) | [] Training on CIFAR-10 [] AdaGN [] BigGAN Upsample/Downsample [] Improved DDPM sampling [] Conditional/Unconditional models [] Super-resolution model [] Interpolation |
| | ILVR (Choi et al.) | [] Sampling using single reference [] Sampling using various downsampling factors [] Sampling using various conditioning ranges |
| | SDEdit (Meng et al.) | [] User input stroke simulation [] Applying to CelebA at 64 x 64 [] Total repeats [] VE SDEdit [] Sampling from scribble [] Image editing only on masked regions |
| | MAE (He et al.) | [] Model architecture for self-supervised pre-training [] Model architecture for classification [] Self-supervised pre-training on ImageNet-1K [] Fine-tuning on ImageNet-1K [] Linear probing |
| | Copy-Paste (Ghiasi et al.) | [] COCO dataset processing [] Large scale jittering [] Copy-Paste (within mini-batch) [] Visualizing data [] Gaussian filter |
| | ViViT (Arnab et al.) | [] 'Spatio-temporal attention' architecture [] 'Factorised encoder' architecture [] 'Factorised self-attention' architecture |
| 2022 | CFG (Ho and Salimans) | |
| Language | | |
| 2017 | Transformer (Vaswani et al.) | [] Model architecture [] Visualizing position encoding |
| 2019 | BERT (Devlin et al.) | [] Model architecture [] Masked language modeling [] BookCorpus data processing [] SQuAD data processing [] SWAG data processing |
| | Sentence-BERT (Reimers and Gurevych) | [] Classification loss [] Regression loss [] Contrastive loss [] STSb data processing [] WikiSection data processing [] NLI data processing |
| | RoBERTa (Liu et al.) | [] BookCorpus data processing [] Masked language modeling [] BookCorpus data processing ('SEGMENT-PAIR' + NSP) [] BookCorpus data processing ('SENTENCE-PAIR' + NSP) [] BookCorpus data processing ('FULL-SENTENCES') [] BookCorpus data processing ('DOC-SENTENCES') |
| 2021 | Swin Transformer (Liu et al.) | [] Patch partition [] Patch merging [] Relative position bias [] Feature map padding [] Self-attention in non-overlapped windows [] Shifted Window based Self-Attention |
| 2024 | RoPE (Su et al.) | [] Rotary Positional Embedding |
| Vision-Language | | |
| 2021 | CLIP (Radford et al.) | [] Training on Flickr8k + Flickr30k [] Zero-shot classification on ImageNet1k (mini) [] Linear classification on ImageNet1k (mini) |
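
A few of the building blocks listed above are small enough to sketch in isolation. The cosine diffusion schedule from Improved DDPM (Nichol and Dhariwal) defines the noise levels from a squared-cosine curve on the cumulative product ᾱ(t), then derives the per-step betas from consecutive ratios. This is a minimal NumPy sketch of that schedule, not the repository's code; the function names and the default `s = 0.008` offset follow the paper.

```python
import numpy as np

def cosine_alpha_bar(T, s=0.008):
    # f(t) = cos^2(((t/T + s) / (1 + s)) * pi/2), normalized so alpha_bar(0) = 1.
    t = np.arange(T + 1)
    f = np.cos(((t / T + s) / (1 + s)) * np.pi / 2) ** 2
    return f / f[0]

def cosine_betas(T, s=0.008, max_beta=0.999):
    # beta_t = 1 - alpha_bar(t) / alpha_bar(t-1), clipped near t = T
    # to avoid a singularity at the end of the schedule.
    ab = cosine_alpha_bar(T, s)
    betas = 1.0 - ab[1:] / ab[:-1]
    return np.clip(betas, 0.0, max_beta)

betas = cosine_betas(1000)
```

Compared with a linear beta schedule, the cosine curve destroys information more slowly at the start of the forward process, which the paper reports helps log-likelihood.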
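
The SimCLR entry lists the normalized temperature-scaled cross entropy (NT-Xent) loss. The sketch below is an illustrative NumPy version under my own naming (`nt_xent`, `tau`), not the PyTorch training code: embeddings of two augmented views are L2-normalized, all pairwise cosine similarities are scaled by a temperature, self-similarities are masked out, and each sample's positive is the other view of the same image.

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    # z1, z2: (N, d) embeddings of two augmented views of the same N images.
    N = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # L2-normalize rows
    sim = z @ z.T / tau                                # (2N, 2N) scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)                     # exclude self-pairs from the softmax
    # The positive for row i is its other view: index (i + N) mod 2N.
    pos = np.concatenate([np.arange(N, 2 * N), np.arange(N)])
    logits = sim - sim.max(axis=1, keepdims=True)      # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * N), pos].mean()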
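
The RoPE entry's single item, rotary positional embedding, rotates each consecutive pair of feature dimensions by a position-dependent angle so that query–key dot products depend only on relative position. This is a minimal NumPy sketch of that rotation (the `rope` helper and pairing convention are mine, not RoFormer's code); the frequencies follow the paper's θ_i = base^(−2i/d) with base 10000.

```python
import numpy as np

def rope(x, base=10000.0):
    # x: (seq_len, dim) with dim even. Rotate each pair (x[:, 2i], x[:, 2i+1])
    # at position m by angle m * base^(-2i/dim).
    seq_len, dim = x.shape
    half = dim // 2
    inv_freq = base ** (-np.arange(half) / half)      # theta_i = base^(-2i/dim)
    angles = np.outer(np.arange(seq_len), inv_freq)   # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

Because each pair is a pure 2-D rotation, position 0 is left unchanged and vector norms are preserved, which is what makes the rotated dot products a function of relative offsets only.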