Hacker News
ModelForge's submissions
1. A Researcher's Field Guide to Non-Standard LLM Architectures (sebastianraschka.com)
2 points by ModelForge 4 days ago
2. Explanation of Gated DeltaNet (Qwen3-Next and Kimi Linear) (github.com/rasbt)
3 points by ModelForge 5 days ago
3. The Core Components of Modern LLMs and the Models Beyond Transformers [video] (youtube.com)
3 points by ModelForge 12 days ago
4. Popular Attention Alternatives: GQA, MLA, SWA (sebastianraschka.com)
4 points by ModelForge 24 days ago
5. Multi-Head Latent Attention (sebastianraschka.com)
4 points by ModelForge 26 days ago
6. Thinking Machines Lab Co-Founder Departs for Meta (wsj.com)
7 points by ModelForge 28 days ago
7. OpenAI's internal Slack messages could cost it billions in copyright suit (sherwood.news)
8 points by ModelForge 29 days ago | 1 comment
8. LLM Evaluation from Scratch: Multiple Choice, Verifiers, Leaderboards, LLM Judge (sebastianraschka.com)
4 points by ModelForge 34 days ago
9. Gemma 3 270M re-implemented in pure PyTorch for local tinkering (github.com/rasbt)
417 points by ModelForge 80 days ago | 57 comments
10. GPT-OSS vs. Qwen3 and a detailed look how things evolved since GPT-2 (sebastianraschka.com)
490 points by ModelForge 3 months ago | 97 comments
11. LLM Research Papers: The 2024 List (sebastianraschka.com)
5 points by ModelForge 10 months ago
12. Scaling Test-Time Compute with Open LLM Models (huggingface.co)
3 points by ModelForge 10 months ago