Hacker News | nickpsecurity's comments

They mostly do that. They risked legal contamination by using Whisper-derived text and web text which might have gotchas. Other than that, it was a great collection for low-risk training.

Models today will be biased based on what's in their training data. If it's English, the model will be biased heavily toward Western, post-1990s views. Then, they do alignment training that forces them to speak according to the supplier's morals. That was Progressive, atheist, evolutionist, and CRT when I used them years ago.

So, the OP model will accidentally reflect the biases of the time. The current commercial models intentionally reflect specific biases. The exception is uncensored models, which accidentally have the biases from their training data, as modified by the uncensoring set.


That is one of the reasons I want it done. We can't tell if AIs are parroting training data without having the whole training data. Making it old means specific things won't be in it (or will be). We can do more meaningful experiments.

That would be an interesting experiment. It might be more useful to make a model with a cutoff close to when copyrights expire to be as modern as possible.

Then, we have a model that knows quite a bit in modern English. We also legally have a data set for everything it knows. Then, there's all kinds of experimentation or copyright-safe training strategies we can do.

Project Gutenberg up to the 1920s seems to be the safest bet on that.


Coursera and Udemy have Math for Machine Learning courses. Udemy is self-paced. If you need to, you can pause to learn an unforeseen prerequisite.

I bought Jon Krohn's Mathematical Foundations and Krista King's Statistics and Probability.


It repeats mixes of what people said and did in its training data. What comes out is what its model says has the highest probability of filling in the blanks. There are even many lies and inaccurate data in the training data of most models.

"Garbage In, Garbage Out"

"You get out of it what you put into it."
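
A toy illustration of that point, as a rough sketch only: a bigram counter that fills in the next word with whatever followed it most often in its training text. Real LLMs are far more complex than this. The function names and the tiny corpus are made up for the example.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for each word, which words followed it in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training data: the false claim appears more often than the true one.
corpus = [
    "the moon is made of cheese",
    "the moon is made of cheese",
    "the moon is made of rock",
]

model = train_bigram(corpus)
print(most_likely_next(model, "of"))  # prints "cheese"
```

The model has no notion of truth. It repeats whichever continuation was most common in what it was fed.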


I have a fear that some evil people will fake history by training LLMs on wrong data, so future generations will struggle to know the truth.

A large amount of the truth we know, we learned because instead of consulting other oracles, we asked the universe about it. Why would that path not remain open to your future generations?

Udemy and Coursera both have Math for Machine Learning courses that start with Linear Algebra. Then, Calculus and Probability and Statistics. They're often $25-50.

You might want to look at their outlines to see what they're teaching. Then, decide if you can do something similar and/or cheaper.


I don't know that we won anything past being useful to executives. Many benefits of life go to people based on social skills, image, and interests. Hacker knowledge or topics are still a negative that holds us back.

I always had to talk and act more like them to make progress in ways nerds usually don't. I had to hide my knowledge or skills mostly to focus on theirs. In other groups or places, I could nerd out with the nerds. I could sometimes do something from one group in the other but had to present it differently.

Back in high school, I think we wanted all the opportunities and attention we saw other people get while being ourselves. That didn't happen. I think most of us still can't have it. So, who won in the culture?

Thanks to Jesus Christ, I don't need it anymore. He gives us peace, joy, purpose, and new family. Also, reveals more quickly how evil those high-status groups were. I'm glad I couldn't become one of them. I still had to repent of what kind of person I became, though.


Thanks for sharing your thoughts. If you’re so inclined to share, could you please share the highly improbable event you mention in your bio which reawakened your faith? You don’t have to mention anything you don’t want to.

My full testimony is here:

https://gethisword.com/mystory.html

If summarizing some key points, I'd name at least a few:

1. I learned through over a decade of liberal activism, and history books, that something was fundamentally wrong with humanity, ruined all we did, and was unfixed for thousands of years.

2. While living in sin, I had a sharp realization that shook me to the bones that I was evil and would pay for it. That God or something like that would make it happen. Many things in my life started running through my mind. This is inexplicable for a devout atheist who literally pissed on a Gospel tract. (Biblically, it's called Holy Spirit conviction.)

3. Many random events happened in a short time that were each statistically unlikely. A few were catastrophic, either immediately or looming. Every way I tried to solve them on my own failed in unlikely ways. The odds against this were astronomical, it was hopeless, and the clock was ticking fast. That forced me to pause to look up.

4. After praying about changing to a God I didn't believe in, a bunch of coincidences happened that started changing my circumstances. Those who were helping were all Christian, too, but most of my environment wasn't.

5. I read a Bible I kept to make fun of Christians. I had studied and debated many religions. This time, it was different. It's like it spoke to me in a piercing way where I knew it was true. I still didn't want to change and so prayed about whether I had to literally believe it.

6. At work, I saw a flash of light go through the whole building, felt like a bolt of lightning, heard a coworker's voice on the radio asking for help, and it was distorted in an angelic way. Minutes later, the situation replayed word for word even with the same tone and spacing. That's called a prophecy or revelation of future events. Atheist science said that was impossible. Again, with a Christian involved.

7. Upon trying to believe in God and do good works, my PTSD and night terrors were instantly cured. Not a placebo cuz I thought I'd have a hard life but live better and avoid extreme problems. Imagine my shock when I felt deep peace for the first time in forever. Peace that psychologists' and neurologists' writings said I'd never feel.

8. God caused random events (providence) to put me in a Bible study. They told me it's not about trying to earn our future by good works because we keep sinning all our lives. The Gospel is that we deserve justice (death/hell). While sinners, Christ lived the perfect life, died for our sins, and rose again to earn a future for us. We receive it as a gift by faith and repentance. Then, live right with God's help in gratitude.

Nothing's been the same since I committed to Jesus Christ. I've since met countless people from all walks of life transformed by the same message with similar effects and stories. That makes it empirically true with tens of millions of data points. Then, stepping out of my echo chambers, I got to learn more truth about history, science, politics, and more.

It was quite a journey that's far from over. I hope more people repent and put their faith in Christ. I hope this helps you because that's why He commands us to share it.


I appreciate your candid response

Quick, random question. I heard way back that SPARK was getting safe pointers in response to Rust's borrow checker.

Has full Ada solved their unsafe de-allocation problem in a way that's comparable to the borrow checker's guarantees?


Yes, it was introduced many years ago and has been in production for quite some time now. You can completely eliminate use-after-free, double-free, dangling pointers, memory leaks, and null dereferences.

You should take another look. A lot has changed in this scene.


I'm considering it now. Aside from correctness verification, the main reason we'd use a limited language for packet inspection is in case the policy is malicious. How often is that the case?

Most people trust most or all of the code running on their machine. They certainly trust their firewall policy to not be malware. If you already trust it, using a better, safe language might be helpful. In many cases, eBPF will be fine.

This isn't the first time this has been done. SPIN was an operating system in Modula-3 that allowed type-safe linking of code into the kernel, balancing safety and performance.

