
I can't believe I'm wasting my time on another testing debate.

Speaking as a formerly young and arrogant programmer (now I'm simply an arrogant programmer), there's a certain progression I went through upon joining the workforce that I think is common among young, arrogant programmers:

1. Tests waste time. I know how to write code that works. Why would I compromise the design of my program for tests? Here, let me explain to you all the reasons why testing is stupid.

2. Get burned by not having tests. I've built a really complex system that breaks every time I try to update it. I can't bring on help because anyone who doesn't know this code intimately is 10x more likely to break it. I limp to the end of this project and practically burn out.

3. Go overboard on testing. It's the best thing since sliced bread. I'm never going to get burned again. My code works all the time now. TDD has changed my life. Here, let me explain to you all the reasons why you need to test religiously.

4. Programming is pedantic and no fun anymore. Simple toy projects and prototypes take forever now because I spend half of my time writing tests. Maybe I'll go into management?

5. You know what? There are some times when testing is good and some times when it's more effort than it's worth. There's no hard and fast rule for all projects and situations. I'll test where and when it makes the most sense and set expectations appropriately so I don't get burned like I did in the past.



One of the dark arts of being an experienced developer is knowing how to calculate the business ROI of tests. There are a lot of subtle reasons why they may or may not be useful, including:

- Is the language you're using dynamically typed? Large refactors in Ruby are much harder than in Java, since there's no compiler to catch dumb mistakes

- What is the likelihood that you're going to get bad/invalid inputs to your functions? Does the data come from an internal source? The outside world?

- What is the core business logic that your customers find the most value in / constantly execute? Error tolerances across a large project are not uniform, and you should focus the highest quality testing on the most critical parts of your application

- Test coverage != good testing. I can write 100% test coverage that doesn't really test anything other than physically executing the lines of code. Focus on testing for errors that may occur in the real world, edge cases, things that might break when another system is refactored, etc.
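
For example (hypothetical names), both of these yield 100% line coverage, but only one of them tests anything:

    from dataclasses import dataclass

    @dataclass
    class Customer:
        is_vip: bool

    def apply_discount(price, customer):
        return price * 0.8 if customer.is_vip else price

    # 100% line coverage, zero confidence: nothing is asserted.
    def test_covers_lines_only():
        apply_discount(100, Customer(is_vip=True))
        apply_discount(100, Customer(is_vip=False))

    # Same coverage, but this one actually pins down the behaviour.
    def test_checks_behaviour():
        assert apply_discount(100, Customer(is_vip=True)) == 80
        assert apply_discount(100, Customer(is_vip=False)) == 100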


I now tend to focus on a black box logic coverage approach to tests, rather than a white box "have I covered every line of code" approach. I focus on things like format specifications, or component contract definitions/behaviour.

For lexer and parser tests, I tend to focus on the EBNF grammar. Do I have lexer test coverage for each symbol in a given EBNF, accepting that token coverage may be duplicated across different EBNF symbol tests? Do I have parser tests for each valid path through the symbol? For error handling/recovery, do I have a test for a token in a symbol being missing (one per missing symbol)?

For equation/algorithm testing, do I have a test case for each value domain? For numbers: zero, a negative number, a positive number, min, max, values that yield the min/max representable output (and one above/below those to overflow).
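
A sketch of that with pytest's parametrize (saturate_int8 is a made-up example function):

    import pytest

    INT8_MIN, INT8_MAX = -128, 127

    def saturate_int8(x):
        # clamp x into the representable int8 range
        return max(INT8_MIN, min(INT8_MAX, x))

    @pytest.mark.parametrize("value, expected", [
        (0, 0),                    # zero
        (-5, -5),                  # negative
        (5, 5),                    # positive
        (INT8_MIN, INT8_MIN),      # min
        (INT8_MAX, INT8_MAX),      # max
        (INT8_MIN - 1, INT8_MIN),  # one below min: saturates
        (INT8_MAX + 1, INT8_MAX),  # one above max: saturates
    ])
    def test_value_domains(value, expected):
        assert saturate_int8(value) == expected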

I tend to organize tests in a hierarchy, so the tests higher up only focus on the relevant details, while the ones lower down focus on the variations they can have. For example, for a lexer I will test the different cases for a given token (e.g. '1e8' and '1E8' for a double token); then for the parser I only need to test a single double token format/variant, as I know the lexer handles the different variants correctly. Then I can do a similar thing in the processing stages, ignoring the error handling/recovery cases that yield the same parse tree as the valid cases.
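
A minimal sketch of that hierarchy (the lexer/parser here are trivial stand-ins, not real implementations):

    import pytest

    def lex_double(text):
        # stand-in for a real lexer rule for the double token
        return ("DOUBLE", float(text))

    def parse_assignment(source):
        # stand-in for a real parser: 'name = <double>'
        name, _, literal = source.partition("=")
        return (name.strip(), lex_double(literal.strip()))

    # Lexer level: one test case per variant of the token.
    @pytest.mark.parametrize("text", ["1e8", "1E8", "100000000.0"])
    def test_lexer_double_variants(text):
        assert lex_double(text) == ("DOUBLE", 1e8)

    # Parser level: a single representative variant suffices,
    # because the lexer tests already cover '1E8' and friends.
    def test_parser_assignment_with_double():
        assert parse_assignment("x = 1e8") == ("x", ("DOUBLE", 1e8))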


I think you missed an important one, which is: how much do bugs even matter?

A bug can be critical (literally life-threatening) or unnoticeable, and its cost includes the response it demands. When I write code for myself I tend to put in a lot of checks and crash states rather than tests, because if I'm running it and something unexpected happens, I can easily fix it up and run it again. That doesn't work as well for automated systems.


You should also understand when those tests can be low effort: look for frameworks that help you write those tests more easily, or that remove the requirement entirely, e.g. Lombok for generating getters/setters. You only have to unit test code that you wrote.

High test coverage comes from a history of writing tests there. Sadly, people include feature and functional tests in the coverage numbers.


There's an easier answer, and that is: as an experienced programmer, don't write any tests for your 'toy project', at least not at the start.

The missing bit in the discussion is 1) churn, and 2) a dev's ability to write fairly clean code.

Early stage and 'toy' projects may change a lot, in fundamental ways. There may be total rewrites as you decide to change out technologies.

During this phase, it's pointless to try to 'harden' anything because you're not sure what it's entirely supposed to do, other than at a high level.

Trying Amazon DynamoDB, only to find a couple of weeks in that it's not what you need ... means it probably wouldn't make sense to run it through the gamut of tests.

Only once you've really settled on an approach, and you start to see the bits of code that look like they're not going to get tossed, does it make sense to start running tests.

Of course the caveat is that you'll need enough coding experience to move through the material quickly, in that no single bit of code is a challenge; it's just that 'getting it on the screen' takes some labour. The experience of 'having done it already many times' means you know it's 'roughly going to work'.

I usually try to 'get something working' before I think too hard about testing; otherwise you 3x the amount of work you have to do, most of which may be thrown out or refactored.

Maybe another way of saying it is: if a dev can code to '80% accuracy', well, that's all you need at the start. You just want the 'main pieces to work together'. Once it starts to take shape, you've got to get much higher than that, and testing is the way to do it.


This is the approach I take as well, and also think about it in terms of “setting things in stone”.

When you’re starting out a project and “discovering” the structure of it, it makes very little sense to lock things in place, especially when manual testing is inexpensive.

Once you have more confidence in your structure as it grows you can start hardening it, reducing the amount of manual testing you do along the way.

People who have hard and fast rules around testing don't appreciate the lifecycle of a project. Different times call for different approaches, and there are always trade-offs. This is the art of software.


I agree with all your points. Have you looked at any strongly typed functional language from the ML family, like OCaml or F#, or similar ones like Rust or Haskell?

If you make a slight tweak somewhere, the compiler will tell you there's something broken in obscure place X that, with Ruby or Python, you would only find out at runtime.

THAT'S the winning formula. I've written so many tests for Python that just ensure a function's arguments are validated, rather than testing its core logic/process.
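
For example (a made-up function, but representative), tests like these exist purely because Python won't check argument types, which an ML-family compiler does for free:

    import pytest

    def schedule(task_name, retries):
        if not isinstance(task_name, str):
            raise TypeError("task_name must be a str")
        if not isinstance(retries, int) or retries < 0:
            raise ValueError("retries must be a non-negative int")
        return f"{task_name} x{retries}"

    # Tests guarding the argument types, not the core logic:
    def test_rejects_non_string_name():
        with pytest.raises(TypeError):
            schedule(42, 3)

    def test_rejects_negative_retries():
        with pytest.raises(ValueError):
            schedule("sync", -1)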


> THAT'S the winning formula.

Not so fast. For some problems it's great, for other ones it's not.

Have you tried writing a numeric or machine learning core in Haskell? You'll notice that the type system just doesn't help you enforce correctness. Have you tried writing low-level IO? The logic is too complex to capture in types; if you try to use them, you'll have a huge problem.


> Have you tried writing low-level IO? The logic is too complex to capture in types; if you try to use them, you'll have a huge problem.

Rust's got a very Haskell-like type system, but it's a systems programming language. People are literally writing kernels in it. I think this is a pure-functional-is-a-bad-way-to-do-real-time-I/O thing, not a typing thing.


While this is true in some senses, Rust's type system is very different from Haskell's when it comes to handling IO.

That said, I don't think it's impossible to type IO. https://lexi-lambda.github.io/blog/2020/01/19/no-dynamic-typ... isn't the same problem, but it's related.


Hmm... Pure functional is a bad way to do real-time I/O, but my point was about types.

If you try to verify the kind of state machines that low-level I/O normally uses with Haskell-like types, you will gain a huge amount of complexity and probably end up with more bugs than without.


Low-level I/O doesn't seem to have that much complexity, unless you're trying to handle all of the engineers' levels of abstraction at once.

Let's say you're writing a /dev/console driver for an RS-232 connection. Trying to represent "ring indicator", "parity failure", "invalid UTF-8 sequence", "keyboard interrupt", "hup" and "buffer full" at the same level in the type system will fail abysmally, but that's not a sensible way of doing it.
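
Sketching the layering idea in Python enums (my own toy names, and obviously not the Rust/Haskell version): each level gets its own small event type, and each layer translates or absorbs its own events instead of exposing one mega-enum:

    from enum import Enum, auto
    from typing import Optional

    class PhysicalEvent(Enum):      # electrical/line level
        RING_INDICATOR = auto()
        PARITY_FAILURE = auto()
        HANGUP = auto()

    class FramingEvent(Enum):       # byte-stream level
        BUFFER_FULL = auto()
        INVALID_UTF8 = auto()

    class SessionEvent(Enum):       # user/session level
        KEYBOARD_INTERRUPT = auto()
        HUP = auto()

    def handle_physical(event: PhysicalEvent) -> Optional[SessionEvent]:
        # each layer absorbs or translates its own events; parity
        # failures are handled here and never leak up to session code
        if event is PhysicalEvent.HANGUP:
            return SessionEvent.HUP
        return None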

I could definitely implement this while leveraging the power of Rust's type system – Haskell would be a stretch, but only because it's side-effect free and I/O is pretty much all side-effects.


I have only done a little bit but I know exactly what you're talking about and it's great.


Really give it a go! It is otherworldly. If you think TypeScript is great, then OCaml/F# will make it look inferior.

If you're doing React + TypeScript, give ReasonML [0] a go; it's syntactic sugar on top of OCaml that compiles to JavaScript via BuckleScript. OCaml has the fastest compiler out there.

[0] https://reasonml.github.io/


You could always go even further to the FP dark side and join the PureScript community >:)


How’s the tooling for that? Haskell has the “best” compiler and garbage tooling that ought to be built on top of the ol’ Rolls-Royce engine it’s rocking.

Meanwhile, the plugins and IDE integrations for Reason/OCaml and F# are ready to go from the start and work pretty well.


Just a data point: with my current team, everyone jumped right in, wrote code, and wrote tests from the start. The tests were integration tests that depended on the test database. It worked great at first, but then tests started failing sporadically as the suite grew. Turning off parallelism helped a bit, but not entirely. Stories started taking longer too, where features entailed broad changes; it felt like every story led to merge conflicts and interdependency, where one person didn't want to implement their fix until someone else finished something that would change the code they were going to work on.

So then I came along and asked, "hey, why don't we have any unit testing?", and it turned out it was because it was pretty much impossible to write unit tests against our code. So I refactored some code and gave a presentation on writing testable code: how the point of unit testing isn't just to have lots of unit tests, it's more that it encourages writing testable code, and the point of having testable code is that your codebase becomes easier to change quickly.

I even showed a simple demonstration based on four boolean parameters and some simple business logic, showing that if it were one function, you'd have to write 16 tests to test it exhaustively, but if you refactored and used mocking, you'd only have to write 12. That surprised people. Through that we reinforced some simple guidelines for how we'd like to separate our code, focusing on pure functions when possible and making layers mockable. We don't even need a complicated dependency injection framework as long as we reduce the number of dependencies per layer.
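
One way that arithmetic can work out (a simplified sketch, not the exact demo): two helpers of two booleans each need 4 exhaustive tests apiece, and the coordinator needs only the 2x2 grid of mocked helper results, so 4 + 4 + 4 = 12 instead of 2^4 = 16:

    from unittest.mock import patch

    def eligible_for_discount(is_vip, has_coupon):
        return is_vip or has_coupon           # 2 booleans: 4 exhaustive tests

    def can_ship(in_stock, address_valid):
        return in_stock and address_valid     # 2 booleans: 4 exhaustive tests

    def checkout(is_vip, has_coupon, in_stock, address_valid):
        if not can_ship(in_stock, address_valid):
            return "blocked"
        if eligible_for_discount(is_vip, has_coupon):
            return "discounted"
        return "full-price"

    # Coordinator: cover the 2x2 grid of mocked helper results.
    # One of the four, as an example:
    def test_checkout_blocked_even_with_discount():
        with patch(f"{__name__}.can_ship", return_value=False), \
             patch(f"{__name__}.eligible_for_discount", return_value=True):
            assert checkout(True, True, False, False) == "blocked"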

Since then we've separated our test suite into integration tests and unit tests, with instructions to rewrite integration tests as unit tests where possible. (Some integration tests are worthwhile, but most existed only because unit tests were hard at the time.) We turned parallelism back on for the unit test suite. The unit tests aren't flaky, and now people run the unit test suite in an infinite loop in their IDE. Over that time our codebase has become better structured, we have less interdependence and fewer merge conflicts, morale has improved, and velocity has gone up.

Anyway, according to this article it sounds like we've done basically the opposite of what we should have done.


Do you have a link to your presentation?


Sorry, nothing that's fit for the general public, but the general gist is that the goal for a test is to be simultaneously small, fast, and reliable.

Following those three principles kind of drives you toward writing testable code. Because if you don't, you might have tests that are only small (simple integration tests), or only fast and reliable (testing unfactored code with lots of mocking); the only way to get all three is to refactor toward testable code that has good layer separation and therefore minimal mocking requirements.

There was stuff in there about how mutable state and concurrency lead to non-determinism and therefore unreliable tests, which is part of what justifies pushing toward pure functions that can be easily unit tested without mocking.
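
The canonical refactor (a generic sketch of mine, not a slide from that presentation) is to hoist the non-determinism out so the core becomes a pure, trivially-testable function:

    import random

    # Non-deterministic: pulls randomness itself, so a test of it
    # can pass or fail from one run to the next.
    def backoff_delay_flaky(attempt):
        return min(60, 2 ** attempt) * random.uniform(0.5, 1.5)

    # Pure: the source of non-determinism becomes a parameter.
    def backoff_delay(attempt, jitter):
        return min(60, 2 ** attempt) * jitter

    def test_backoff_is_capped():
        assert backoff_delay(10, jitter=1.0) == 60

    def test_backoff_applies_jitter():
        assert backoff_delay(1, jitter=0.5) == 1.0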


> because I spend half of my time writing tests.

Only half your time? You're doing testing wrong if it doesn't take 80% of the time ;-)

I have a love-hate relationship with testing. Working for myself as a company of one, some of the benefits testing brings just don't apply. I have a suite of programs built in the style of your point (1). The programs were quick to market, hacked out while my savings ran down, not knowing if I would make a single sale.

Sales came, customer requests came, new features were wanted, sales were promised "if the program could just do xyz". More things were hacked on. The promise of "I will go back and do this properly and tidy up this god-unholy mess of code" slowly slipped away, until I stopped lying to myself that I would do it.

Yes, there was a phase of fix-one-problem-add-another, but I have most of that in my head now, and it has been a long time since that happened.

Not a single test. Developing the programs was "fun" and exciting. Getting requests for features in the morning and having the build ready by lunch kept customers happy.

Now I am redoing the apps as a web app for "reasons". This time I am doing it properly, testing from the start. I know exactly what the program should do and how to do it, unlike the first time, when I really had no idea. But still, I come to a point and realise the design is wrong and I hadn't taken something into consideration. Changing the code isn't so bad; changing the tests, O.M.G.

I am so fed up with the project that I do all I can to avoid it. It is 2 years late, and I wish I had never started it. The codebase has excellent testing, mocks, no little hacks; engineering-wise I am proud of it. The tests have found little edge cases that would otherwise have been found by customers, so that was avoided. But there is no fun in it. No excitement. It is just a constant, drudging slog.

I am trying to avoid dismissing testing altogether, as I really want to see the benefit of it in a substantial production codebase. If I ever get there. At the moment, it is the best-tested unused software ever written, IMO.


Well, then stop! Delete all the tests right now and do it however you want to do it.

The thing about testing that never really gets talked about is: what's the penalty for regressions? What are the consequences if you ship a bug so bad the whole system stops working?

Well, if you're building a thing that's doing hundreds of millions in revenue, that might be a big deal. But you? You're a team of one! You roll back that bad deploy and basically no one cares!

Your customers certainly don't care if you ship bugs. If it was something important enough where they REALLY cared, they wouldn't be using a company of one person.

So, go for it. Dismiss tests until you get to a point where you fear deploying because of the consequences. Then add the bare minimum of e2e tests you need to get rid of that fear, and keep shipping.


There is another cost: when you try to fix a bug and break something else. If your codebase becomes so brittle that you feel you can't do anything without breaking something else, it becomes unbearable to keep going with that project.

Having said all that, I find it's better to skip some unit tests when building your own project. It can be better to do the high-level tests (some integration, focused on the system) to make sure the major functionality works. In many cases, for an app that's not too complicated, you can just have a rough manual test plan. Then move to automated tests later on if the app gets popular, or the manual testing becomes too cumbersome.

It's still good to have a few unit tests for some tricky functions that do complicated things so you aren't spending hours debugging a simple typo.


Sure. My point wasn't really whether to write unit tests or not. It's more, do what works for you / your team to enable you to ship consistently. For the OP, spending all of their time writing tests clearly isn't working for them if they haven't shipped at all.


> Well, if you're building a thing that's doing hundreds of millions in revenue, that might be a big deal. But you? You're a team of one! You roll back that bad deploy and basically no one cares!

Human lives, customer faith in the product, GDPR violations, HIPAA violations, data, time/resources in space missions

https://medium.com/@ryancohane/financial-cost-of-software-bu...

https://en.wikipedia.org/wiki/Mars_Climate_Orbiter


> But you? You're a team of one! You roll back that bad deploy and basically no one cares!

I somehow doubt that comparing this 'team of one project' to the Mars Climate Orbiter leads to any useful conclusions. It's a nice bit of hyperbole though!


Rollbacks can create data loss. Also, rollbacks are not always a viable option.

Anyway, this was to address the issue of a bug. I took the comment of "it's just a team of one" as a way of trying to justify not putting your engineering due diligence into delivering a product to the customer.


> Rollbacks can create data loss. Also, rollbacks are not always a viable option.

I've delivered a number of products (in the early days of my career) to clients where data loss happened, and while it wasn't fun, it also didn't significantly harm the product or piss off said client. I saw my responsibility primarily as doing the best I could and clearly communicating potential risks to the client.

> I took the comment of "it's just a team of one" as a way of trying to justify not putting your engineering due diligence into delivering a product to the customer.

That I do agree with, but 'due diligence' is a very vague concept. I guess honest communication about the consequence of various choices is perhaps the core aspect?

And of course 'engineering due diligence', in my opinion, includes making choices that might lead to an inferior result from a 'purely' engineering perspective.


> not putting your engineering due diligence into delivering a product to the customer.

Yes. This is exactly what this person should do. Stop worrying about arbitrary rules and just deliver the damn product already. A hacky, shitty, unfinished product in your customer's hands that can be iterated on beats one that never got shipped at all, every day of the week.


I don't understand your point. We're specifically talking about a one person company.


LOL. I guess I was being a bit conservative with that estimate!

I've worked for myself as well and know what you mean. In my situation, I was able to save myself from testing by telling my customers "this is a prototype so expect some issues".


My observation of codebases that weren't written with/for unit tests is that they always end up as a monolith where you have to run all of it in order to run any of it. Having decent code coverage means it's at least possible to run just that one function that fails on the second Tuesday of the month when that one customer in Albania logs in.
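
To make that concrete (a hypothetical sketch): the "second Tuesday" logic is only this cheap to test because the date is a parameter rather than a hidden call to date.today():

    from datetime import date

    def is_second_tuesday(d: date) -> bool:
        # weekday() == 1 is Tuesday; days 8-14 hold the second one
        return d.weekday() == 1 and 8 <= d.day <= 14

    def test_second_tuesday():
        assert is_second_tuesday(date(2020, 1, 14))       # second Tuesday
        assert not is_second_tuesday(date(2020, 1, 7))    # first Tuesday
        assert not is_second_tuesday(date(2020, 1, 13))   # a Monday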


Your points are fine, but I do not see how they apply to the blog post.

Overall, the blog post says unit tests take a long time to write compared to the value they bring; instead (or in addition), focus on more valuable automated integration/e2e tests, because writing them is much easier than it was 10-20 years ago.


My point is that OP is in step 1 of 5. That's not to say there aren't any good thoughts there, but the overall diatribe comes from a place of inexperience, so take their advice with a grain of salt.


I don't think OP is step 1. OP is not arguing against testing, although the title could lead one into thinking that. OP is arguing for better, more reasonable testing.


OP appears to be arguing what you call step 5 of 5. They're not even saying you should never unit test, only that it should be avoided where it doesn't make sense, and that this happens more often than step-3 people like to think. Furthermore, the main direction of the article is that it's arguing for integration testing as a viable replacement for unit testing in a lot of situations, which doesn't relate to your overall point at all.


The comment relates to a testing-mindset progression, which is relevant.

Your comment, on the other hand, less so...


Step 5 touches on what I like to call "engineering judgment".

One of the things that distinguishes great engineers is that they make good judgment calls about how to apply technology or which direction to proceed. They understand pragmatism and balance. They understand not to get infatuated with new technologies but not to close their minds to them either. They understand not to dogmatically apply rules and best practices but not to undervalue them either. They understand the context of their decisions, for example sometimes code quality is more important and other times getting it built and shipped is more important.

As in life, good and bad decisions can be the key determiner of where you end up. You can employ a department full of skilled coders and make a few wrong decisions and your project could still end up a failure.

Some people never develop good engineering judgment. They always see questions as black and white, or they can't let go of chasing silver bullet solutions, etc.

Anyway, it's one thing to understand how to do unit tests. It's another thing to understand why you'd use them, what you can and can't get out of them, what the costs are, and take into account all that to make good decisions about how/where to use them.


This. These days I write unit tests only for functions whose mechanism is not immediately clear. The tests serve as documentation, specification of corner cases, and assurance for me that the mechanism does what it was intended to do.

I keep tests together with the code, because of their documentation/specification value.
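
In Python this can be as literal as a doctest, where the corner cases live in the function's own docstring (a toy example):

    def wrap_index(i, length):
        """Map any integer onto a valid index, wrapping both ways.

        The corner cases are the whole reason this function exists:

        >>> wrap_index(0, 5)
        0
        >>> wrap_index(7, 5)
        2
        >>> wrap_index(-1, 5)   # negative indices wrap to the end
        4
        """
        return i % length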

I do not write tests for functions which are compositions of library functions. I do not test pre/post-conditions (these are something different).

And I definitely do not try to have "100% test coverage".


Spot on.

Personally, I fast-tracked through 2-4 out of sheer laziness, but that's definitely my progression with regard to testing and pretty much everything related to code quality. That includes comments, abstraction, purity, etc...

More generally:

- Initially, you are a victim of the Dunning–Kruger effect, standing proudly on top of Mount Stupid. You think you can do better than the pros by not wasting time on "useless stuff".

- Obviously, that's a fail. You realize the pros may have a good reason for working the way they do. So you start reading books (or whatever your favorite learning material is) and blindly follow what's written. It fixes your problems and replaces them with other problems.

- After another round of failure, you start to understand the reasoning behind the things written in the books. Now, you know to apply them when they are relevant, and become a pro yourself.


Sounds like my life story ;).

One thing I do religiously all the time is putting asserts everywhere. It's the only thing you can go crazy on. The rest is indeed always a balancing act.
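
For illustration (a trivial made-up sketch), "asserts everywhere" looks like:

    def apply_payment(balance, amount):
        # cheap invariants: crash loudly at the first sign of
        # confusion instead of writing a test for every caller
        assert amount > 0, f"non-positive payment: {amount}"
        new_balance = balance - amount
        assert new_balance >= 0, f"overdraft: {new_balance}"
        return new_balance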


Hear, hear.



