Hacker News | DanielBryars's comments

Great idea, really cool.

I noticed that in the example you shared it highlighted the choice of SHA1 for further attention because it is deprecated. I think that's good. In this case, let's say I do actually want to use it and pop a comment above it, "SHA1 deliberate, partitioning only, no security exposure". I presume the LLM would take that into account. I'll try it out when I can.
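For illustration, here is a minimal sketch of the kind of code I have in mind, written in Python. The function name `partition_for` and the partition count are hypothetical and not taken from the example that was shared; `usedforsecurity=False` is a `hashlib` flag available from Python 3.9 that marks the hash as non-security use.

```python
import hashlib

NUM_PARTITIONS = 16  # hypothetical partition count, for illustration only


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a stable partition index in [0, num_partitions)."""
    # SHA1 deliberate, partitioning only, no security exposure.
    digest = hashlib.sha1(key.encode("utf-8"), usedforsecurity=False).digest()
    # Interpret the first 8 bytes as an unsigned integer, then reduce to a bucket.
    return int.from_bytes(digest[:8], "big") % num_partitions
```

The comment sits directly above the call it justifies, which is where I would expect a reviewer (human or LLM) to look for it.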


Like Lean 4?


Yes, if the ruling allowing the GameGenie is about the freedom to tinker, I see this being an effective defence for building an AI myself from my own books for my own use. But, removing the need to buy the book in the first place is the key problem that the article seems to ignore.


FYI: On the login, it tripped me up a couple of times because the username is case sensitive. There is a tradeoff between security, usability, and support requests; the input is labelled email, and email addresses are usually not case sensitive (and when they are actually used to deliver email they are, in practice, never treated as case sensitive), so it confused me.


TIL. I always assumed emails were case sensitive, and doubly so if used as a username. I find it strange that you even discovered this 'wrong' behaviour on the site in question: you purposefully typed your email address with different casing when logging in vs. registering?


"I always assumed emails are case sensitive"

Wikipedia has a good summary on what is valid. https://en.wikipedia.org/wiki/Email_address

From the second paragraph:

Although the standard requires the local-part to be case-sensitive,[1] it also urges that receiving hosts deliver messages in a case-independent manner,[2] e.g., that the mail system in the domain example.com treat John.Smith as equivalent to john.smith; some mail systems even treat them as equivalent to johnsmith.[3]

You'll find the footnote links at the Wikipedia article; I'm not going to paste them here. So yes, if my email address is my username, then I would expect it to work the same in uppercase or lowercase. If my username is "like an email address, but not an email address" then you make the rules for your site.


I have never had an email server be case sensitive, and often use that for mail filtering: myuSer@ - the big "S" is for spam!

In line with that, I would expect the login to not be case sensitive when it accepts an email.
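As a rough sketch of what I mean, a login form could normalise the address before looking it up. This is Python, the function name `normalize_login_email` is made up for illustration, and lower-casing the local part is a site policy assumption (the standard technically allows case-sensitive local parts), not something every mail system guarantees.

```python
def normalize_login_email(raw: str) -> str:
    """Return a canonical form of an email address used as a login name."""
    # Split on the last "@" so the domain is whatever follows it.
    local, sep, domain = raw.strip().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {raw!r}")
    # Assumption: this site treats the whole address as case-insensitive.
    return f"{local.lower()}@{domain.lower()}"
```

The same function would need to be applied at registration and at login, so that "myuSer@example.com" and "myuser@example.com" resolve to the same account.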


I'm curious, was it profitable? I don't mean to diminish other factors for keeping it alive, just wondering.


Less profitable than a comparable gas turbine because it couldn't perform the same grid role as peaker plants. Make no mistake, it was gas that killed coal here, not wind.

A coal plant even with all of the modern upgrades is and always has been happiest as a baseload generator. It takes about 4-6 hours for a coal plant to come up from cold start, compared to about 5 mins for a gas peaker plant. This means you can use it for planned/predicted grid peaks but you'll have to run during some unprofitable times to do so.

Essentially coal has the same problem as wind, it's producing at the wrong times. If you want it to be really profitable you need pumped storage and batteries to hold that energy for peaks, something we're still short of in the UK.


I don't know for sure, but I'd guess so. Use of coal as an energy source is fairly commonplace. Germany is an example of a country where coal makes up a huge share of its power.

It's probably becoming less profitable though: there aren't a lot of guarantees because coal takes a long time to start up, and the UK's energy price is volatile due to the amount of wind energy in the system.

Probably a reason why natural gas continues to be a fossil fuel with a lot of use in the UK. It's very quick to turn on and off at times when wind is low/high.


Last year Germany had more power from wind (32%) than coal (28%).

The UK (or EU?) introduced air pollution limits and carbon emission taxes. That made coal power unprofitable.

Ratcliffe had a supplemental income as an emergency producer of power, which presumably left it profitable, but it is no longer needed for that.


The sentiment is right, and sure, there are some differences, but please don't think your 50s is some geriatric period of your life where hiking across the Alps/Europe is not thoroughly enjoyable!


What's the utility of defining the "Error" exception? Why not use an existing one, say InvalidOperationException, or a plain Exception? Is making your own better practice?


There is no utility. It's perhaps written for JavaScript developers who are used to Error.. but it's not idiomatic C#. Might be indicative of a copilot too.

The use of a class-scoped `StringBuilder` that only one method uses, and `ReadQuotedColumn`/`ReadNonQuotedColumn` yielding one character at a time rather than accepting the builder, isn't a good sign either (for efficiency). Or casting everything to a `char` (this won't support UTF8), or assuming an end quote followed by anything (:71) is a valid way to end a field.


C# `char` is a UTF-16 code unit. It does not indicate a byte which is just `byte`.

Having StringBuilder be a private field on the parser instance is not an issue either - it is simply reused.


Iterating over the `char`s does not support the full range of what can be stored in a C# string (for instance, UTF-8 graphemes that are serialized as surrogate pairs are usually two `char`s in a C# string).

.Net provides a TextElementEnumerator that will iterate over graphemes instead: https://learn.microsoft.com/en-us/dotnet/api/system.globaliz...

There's a fairly comprehensive guide to working with .net character encodings at https://learn.microsoft.com/en-us/dotnet/standard/base-types... .


The return value of StreamReader.Read() will always be within bounds of -1 and char.MaxValue.

All surrogate pairs will be drained into the StringBuilder, working correctly. Most implementations usually agree that torn UTF-16 surrogate pairs (which are strictly the code points outside of basic multilingual plane) may exist in the input and will be passed as is, which is different to what UTF-8 implementations choose (Rust is strict with this, Go lets you tear code points arbitrarily).
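To make that concrete without a C# toolchain, here is a small Python sketch that mimics the situation. The helper names are mine and purely illustrative: the text is split into UTF-16 code units (the analogue of C# `char`s), each unit is appended to a buffer one at a time, and the buffer is decoded again.

```python
import struct
from typing import List


def utf16_code_units(text: str) -> List[int]:
    """Return the UTF-16 code units of text (the analogue of C# chars)."""
    data = text.encode("utf-16-le")
    return list(struct.unpack(f"<{len(data) // 2}H", data))


def from_code_units(units: List[int]) -> str:
    """Rebuild a string from UTF-16 code units."""
    return struct.pack(f"<{len(units)}H", *units).decode("utf-16-le")


def copy_unit_by_unit(text: str) -> str:
    """Copy code units one at a time, like appending chars to a builder."""
    buffer: List[int] = []
    for unit in utf16_code_units(text):
        buffer.append(unit)  # the two halves of a pair arrive separately
    return from_code_units(buffer)
```

An emoji such as U+1F600 occupies two code units (a high and a low surrogate), yet copying unit by unit round-trips it unchanged, because both halves end up adjacent in the buffer in the original order.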

We, as a community, can do better than to jump to immediate criticism of this type.


If you (a consuming dev) want the world's smallest (in your code) - use the .net built in parser[0]. Bonus, it's RFC4180 compliant.

If you (competing/learning) want to write the world's smallest (code golf style)... this isn't it, and has some weird superfluous lines (if that's your measure - per the original question).

If you (learning) want to write an efficient parser.. this isn't it. You don't need a StringBuilder, you can seek the Stream to collect the (already formed) strings directly from source vs char-by-char memory copy and rebuild. Yes; that limits your stream choices, but since the example/tests only use FileStreams (which are seekable) you might not come across other kinds. If you need to use un-seekable streams, then you'll need to use a large enough buffer.

[0]: https://learn.microsoft.com/en-us/dotnet/api/microsoft.visua...


This is not a correct link (it refers to VB.NET). There are better parsers out there (Sep).

I'm not sure what your point is, but it certainly misses the idea behind this HN submission and makes me sad, as it would be nice to see words of encouragement in .NET submissions here instead.


It's an assembly with "Microsoft.VisualBasic" in the name, but it shipped as part of every version of .NET to date, and is perfectly usable from C#. In fact, I would be very surprised if there aren't vastly more uses of this API from C#, since it's a very old trick of the trade.

What GP is saying is that, given that it is already included in the standard class library, it's always the cheapest option wrt size of your shipping app. So it should arguably be the default choice for any .NET dev unless they either need better performance or some more exotic requirements wrt input format.


What is it with .NET or C# submissions (but I suppose other languages are not immune either) that attracts this type of reply, which misses the point behind a particular piece of code, trivial or not?

Yes, there are existing implementations, many of which are incomparably better, one of which ships with default project SDK (even if it is effectively obsolete[0]). But surely offering a competitive implementation that intends to replace existing solutions wasn't the purpose of this?

Either way, I'm not the author of the code and have already spent enough (free) time in the last 8 months working on a string library which has performant parsing as one of the project goals[1].

[0] https://github.com/dotnet/runtime/tree/main/src/libraries/Mi...

[1] https://github.com/U8String/U8String


The unit tests have an emoji test (which uses a surrogate pair). I thought I would have to use Runes, but it's not necessary. https://github.com/kjpgit/SmallestCSVParser/blob/master/Smal...


You imply that a string, reversed, would have the same length as the original.

This is not true.


Where are they implying this and why would the strings not have the same length? Is there normalization implied somewhere?


If they weren't reversing it, what other operation would separate grapheme clusters?


> Having StringBuilder be a private field on the parser instance is not an issue either - it is simply reused.

It doesn’t matter for this API, but it is a code smell. It makes the class not reentrant.

Talking of the API, I would make it simpler to use and more idiomatic by making the entire public API:

   static IEnumerable<List<String>> parse(StreamReader sr)

That call would store the parser state (currently just the StreamReader and that reused StringBuilder) in a private inner class. There would not be a constructor of the publicly visible class, removing that code smell.
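A rough sketch of that shape, in Python rather than C# and deliberately simplified: it splits on commas and "\n" only, with no quoting and no "\r" handling, so it is not a CSV parser and is not the submitted code. The names `parse` and `_ParserState` are illustrative.

```python
import io
from typing import Iterator, List, TextIO


class _ParserState:
    """Private per-call state: the reader plus a reusable field buffer."""

    def __init__(self, reader: TextIO) -> None:
        self._reader = reader
        self._buf: List[str] = []  # reused for every field within this call only

    def _take_field(self) -> str:
        field = "".join(self._buf)
        self._buf.clear()
        return field

    def rows(self) -> Iterator[List[str]]:
        row: List[str] = []
        while True:
            ch = self._reader.read(1)
            if ch == "":  # end of input
                if row or self._buf:
                    row.append(self._take_field())
                    yield row
                return
            if ch == ",":
                row.append(self._take_field())
            elif ch == "\n":
                row.append(self._take_field())
                yield row
                row = []
            else:
                self._buf.append(ch)


def parse(reader: TextIO) -> Iterator[List[str]]:
    """The entire public API: one function, no publicly visible state."""
    return _ParserState(reader).rows()
```

Because every call to `parse` gets its own state object, two parses can be interleaved without sharing the buffer:

```python
first = parse(io.StringIO("a,b\nc,d\n"))
second = parse(io.StringIO("1,2\n"))
print(next(first), next(second), next(first))
```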


I will add a micro benchmark to see if the `yield return` is slowing things down, compared to just calling _sb.Append() inside Read*(). I will also see if it looks cleaner that way. To be honest, the `yield return` is currently in there just because I thought it was "cool".


30% performance improvement after removing the `yield return`, and readability is probably better too.


It's good practice to throw an exception from your own namespace if you're writing a library.

You don't want to expose an implementation detail like some specific exception as part of your public API and have to worry about breaking that later.

You could overload some built in exception but IMO that's not the best practice. You muddy your API and a caller has to wrap your exception if they want to bubble it up and catch it specifically, anyway.
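The same idea, sketched in Python rather than C# and with made-up names (`CsvParseError`, `parse_int_field`): the library defines its own exception type, wraps whatever the implementation happens to throw, and callers only ever need to catch the library's type.

```python
class CsvParseError(Exception):
    """Raised by this (hypothetical) library for malformed input."""

    def __init__(self, message: str, line: int) -> None:
        super().__init__(f"line {line}: {message}")
        self.line = line


def parse_int_field(text: str, line: int) -> int:
    """Parse one integer field, reporting failures as CsvParseError."""
    try:
        return int(text)
    except ValueError as exc:
        # Wrap the built-in error so callers depend only on the library's type;
        # the original exception stays available as __cause__ for debugging.
        raise CsvParseError(f"expected an integer, got {text!r}", line) from exc
```

If the implementation later switches to a different conversion routine that raises something else, the public contract (catch `CsvParseError`) does not change.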


> Why not use ... a plain Exception.

It is forbidden.

https://learn.microsoft.com/en-us/dotnet/standard/exceptions...

> Exception ... None (use a derived class of this exception).

https://learn.microsoft.com/en-us/dotnet/standard/design-gui...

> DO NOT throw System.Exception or System.SystemException.


It’s a guideline not to. It’s not a hard rule or forbidden.


I'm not seeing where it says not to throw InvalidOperationException.


InvalidOperationException means "the object is in an inappropriate state". That does not describe a parse error.

C# conventions for exceptions are admittedly a bit confusing. There are a handful of very specific scenarios where you're supposed to use a built-in exception (most commonly ArgumentException). For everything else, you want to define your own type.


The state of the stream, such that it's pointing to an illegal character, actually does seem to be invalid though. Maybe this is an overly pedantic argument. I've been writing quite a bit of C# for over a decade and have basically been doing this the whole time. I thought that I knew how to use exceptions, but it seems I do not.


If you want to recover from CSV errors you should ideally throw InvalidCSVException.


I had to do this 10 years ago for a SaaS application. It was standard procedure from the customer in question (a large multinational corporation) for "critical applications", and I could understand their motivation. However, on renewal of the contract, the escrow clause was dropped. I'm not sure if this was because we were more trusted, or because their policies changed (I think the cost was a factor).

Many other large customers consumed our services, but none of them asked for an escrow. Some contracted for "special ways" to retrieve their data (for example, direct access to database backups and so on) in the case that we went insolvent. I'm not sure what legal mechanism this used.

The customer in question had several "levels" of escrow, and in this case they wanted the full escrow, which is more than just a dump of the code: it required all code, all dependencies, all bootstrap data, all configuration files, all build tools, and detailed instructions for building and running the app. An external company worked with us so that they could independently build the application and witness it running. It was very expensive, very disruptive, and very time consuming (it took about 3 days of prep, and 5 days with the external company). I remember it felt like a lifetime. The customer picked up the bill for the escrow, which included the cost of the independent company and our time (but not the opportunity cost).

In my opinion these escrows are of very little value (for example, the code continually goes out of date, and who is going to run the service if the customer doesn't have the skills?). In my experience it was a total PITA, and personally I'd avoid it and try as hard as I could to use a different device to provide the assurance that they need (e.g. contracting that they can access their data in the event of insolvency, or at a push putting the built artifacts and runtime configurations into escrow).


Thanks for sharing! It does seem like a lot of trouble for little if any benefit.


5 days. This is very short imo.


"Fatal error: The table 'sessions' is full query"

Anyone know who runs this? It's a great website, I wonder if they know it's down?


Do you happen to know when this started?


No sorry, I first noticed it today around 16:00 UTC. But that's the first time I have visited the site for months.


