akhaliq (AK) – Community Activity

Paper • 2409.16235 • Published about 16 hours ago • 6 •

EuroLLM: Multilingual Language Models for Europe

Paper • 2409.15700 • Published 1 day ago • 12 •

commented 3 papers about 4 hours ago

commented a paper about 5 hours ago

Making Text Embedders Few-Shot Learners

Paper • 2409.16283 • Published about 14 hours ago • 4 •

commented a paper about 6 hours ago

Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation

Paper • 2409.13910 • Published 4 days ago • 5 •

commented 5 papers 1 day ago

Zero-shot Cross-lingual Voice Transfer for TTS

Paper • 2409.15273 • Published 1 day ago • 7 •

MaterialFusion: Enhancing Inverse Rendering with Material Diffusion Priors

Paper • 2409.14713 • Published 2 days ago • 20 •

Phantom of Latent for Large Language and Vision Models

Paper • 2409.14393 • Published 3 days ago • 6 •

MaskedMimic: Unified Physics-Based Character Control Through Masked Motion Inpainting

Paper • 2409.14340 • Published 3 days ago • 1 •

Self-Supervised Audio-Visual Soundscape Stylization

Paper • 2409.13598 • Published 5 days ago • 25 •

commented 6 papers 2 days ago

Prithvi WxC: Foundation Model for Weather and Climate

Paper • 2409.13346 • Published 5 days ago • 57 •

Imagine yourself: Tuning-Free Personalized Image Generation

Paper • 2409.13690 • Published 5 days ago • 10 •

Colorful Diffuse Intrinsic Image Decomposition in the Wild

Paper • 2409.13648 • Published 5 days ago • 8 •

V^3: Viewing Volumetric Videos on Mobiles via Streamable 2D Dynamic Gaussians

Paper • 2409.13591 • Published 5 days ago • 12 •

Portrait Video Editing Empowered by Multimodal Generative Priors

Paper • 2409.13216 • Published 5 days ago • 17 •

MuCodec: Ultra Low-Bitrate Music Codec

New activity in RED-AIGC/StoryMaker 5 days ago

gradio demo

#2 opened 5 days ago by

Paper • 2409.12917 • Published 6 days ago • 107 •

commented 9 papers 5 days ago

Training Language Models to Self-Correct via Reinforcement Learning

Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution

Paper • 2409.12961 • Published 6 days ago • 22 •

Paper • 2409.12960 • Published 6 days ago • 20 •

LVCD: Reference-based Lineart Video Colorization with Diffusion Models

Paper • 2409.12957 • Published 6 days ago • 16 •

3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion

Paper • 2409.12892 • Published 6 days ago • 5 •

3DGS-LM: Faster Gaussian-Splatting Optimization with Levenberg-Marquardt

Paper • 2409.12576 • Published 6 days ago • 14 •

StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation

Paper • 2409.12568 • Published 6 days ago • 44 •

InfiMM-WebMath-40B: Advancing Multimodal Pre-Training for Enhanced Mathematical Reasoning

FlexiTex: Enhancing Texture Generation with Visual Guidance

Paper • 2409.12431 • Published 6 days ago • 9 •

Paper • 2409.12532 • Published 6 days ago • 5 •

Denoising Reuse: Exploiting Inter-frame Motion Consistency for Efficient Video Latent Generation

New activity in cerebras/chain-of-thought 5 days ago

streaming outputs

#3 opened 5 days ago by

New activity in akhaliq/dailypapershackernews 5 days ago

Create app.py

#2 opened 6 days ago by

guy1eyal

New activity in yanze/PuLID-FLUX 6 days ago

add developers local gradio demo section

#6 opened 6 days ago by

Paper • 2409.12183 • Published 7 days ago • 28 •

commented 4 papers 6 days ago

To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

Paper • 2409.11901 • Published 7 days ago • 29 •

LLMs + Persona-Plug = Personalized LLMs

Paper • 2409.12191 • Published 7 days ago • 63 •

Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

Paper • 2409.12139 • Published 7 days ago • 11 •

Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models

commented 8 papers 7 days ago

Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Paper • 2409.11355 • Published 8 days ago • 25 •

Paper • 2409.10819 • Published 8 days ago • 15 •

EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer

Paper • 2409.11406 • Published 8 days ago • 22 •

Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Paper • 2409.11402 • Published 8 days ago • 54 •

NVLM: Open Frontier-Class Multimodal LLMs

Paper • 2409.11340 • Published 8 days ago • 75 •

OmniGen: Unified Image Generation

Paper • 2409.11211 • Published 8 days ago • 7 •

SplatFields: Neural Gaussian Splats for Sparse 3D and 4D Reconstruction

Paper • 2409.10923 • Published 8 days ago • 10 •

Agile Continuous Jumping in Discontinuous Terrains

Paper • 2409.10568 • Published 11 days ago • 12 •

On the limits of agency in agent-based models

Paper • 2409.11367 • Published 8 days ago • 12 •

New activity in akhaliq/dailypapershackernews 7 days ago

dark mode

#1 opened 7 days ago by

hysts

commented 2 papers 7 days ago

OSV: One Step is Enough for High-Quality Image to Video Generation

Paper • 2409.10594 • Published 9 days ago • 34 •

Kolmogorov-Arnold Transformer

Paper • 2409.10173 • Published 9 days ago • 20 •

commented 3 papers 8 days ago

jina-embeddings-v3: Multilingual Embeddings With Task LoRA

Paper • 2409.08831 • Published 12 days ago • 4 •

Breaking reCAPTCHAv2

Paper • 2409.09214 • Published 11 days ago • 44 •

Seed-Music: A Unified Framework for High Quality and Controlled Music Generation

New activity in sambanovasystems/Llama3.1-Instruct-O1 9 days ago

use gr chatbot

#5 opened 9 days ago by

downgrade openai version

#4 opened 9 days ago by

fix gradio demo issue and not use chatbot component

#3 opened 9 days ago by

update for gradio

#2 opened 9 days ago by

use gradio

#1 opened 9 days ago by

Paper • 2409.08947 • Published 12 days ago • 11 •

commented 6 papers 9 days ago

A Diffusion Approach to Radiance Field Relighting using Multi-Illumination Synthesis

Paper • 2409.08857 • Published 12 days ago • 29 •

InstantDrag: Improving Interactivity in Drag-based Image Editing

Paper • 2409.08615 • Published 12 days ago • 14 •

DrawingSpinUp: 3D Animation from Single Character Drawings

Paper • 2409.08514 • Published 12 days ago • 8 •

Apollo: Band-sequence Modeling for High-Quality Audio Restoration

Paper • 2409.08513 • Published 12 days ago • 10 •

Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection