Last week I decided to try out Pyrefly, Ty, and Zuban for their language server features (type checking disabled where possible) and found that Zuban is the fastest (but unfortunately doesn't currently have an option to disable type checking). Ty comes next, and Pyrefly was surprisingly slow to load (I have to wait a few seconds before I can go to definition, for example).
They all lack certain features compared to basedpyright (what I was using), such as auto imports (Ty has experimental support), showing the signature/doc when selecting autocomplete options (I think Ty does have this one), and some other features that I can't remember.
One feature that has always existed in Jedi (and now also Zuban) is "goto declaration" in Python. It lets me go to the "import" instead of the original definition of a function/class. I'm surprised this either isn't supported at all (pyright?) or just does the same thing as "goto definition" (Ty); seems like an obvious oversight imho.
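For illustration, with hypothetical file and function names, the difference I mean:

    # app.py
    from utils import helper    # "goto declaration" on helper should land on this import

    helper()                    # "goto definition" jumps straight to utils.py instead

    # utils.py
    def helper() -> None:       # ...where the original definition lives
        ...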
Edit: Also, I wish all these new tools gave as much importance to these IDE features as they do to type checking.
Hey, Pyrefly developer here, thanks for trying us out!
We're dedicated to providing a great IDE experience, though it does take some time. Please bother us on github / our discord if you have features you want - bug reports / community asks are our biggest priority.
- auto import is implemented in Pyrefly: it uses your pyrefly.toml project structure or falls back to your VSCode workspace (up to the first 2500 files). We're happy to fix it for your situation if you want to provide a reproduction
- speed: by far the biggest issue. Your problem is likely related to [this](https://github.com/facebook/pyrefly/issues/360), but we need more information to speed it up. We're happy to work with you to improve this if you're willing to provide a project structure
Hi, thanks for the links and for opening the goto declaration issue (that's awesome). I will try to dedicate more time to my setup and provide feedback where possible. Looking forward to all the new updates!
If ruff & uv have proved anything, it's that a tool that's effortless, a net positive, and fast will get adopted.
New typecheckers don't need to be perfect. They need to be good enough, easy to integrate, and have few false positives. Sure, they will improve with time, but if it feels like a pain then no one will pick it up. Python users are famously averse to tools that slow down their dev cycles, even if it means better long-term stability.
BasedPyright is popular because it comes built-in with Cursor and disappears into the background. I have a positive bias towards Astral figuring it out given their track record. But, none of these tools have reached the point of effortlessness just yet.
> New typecheckers don't need to be perfect. They need to be good enough, easy to integrate, and have few false positives. Sure, they will improve with time, but if it feels like a pain then no one will pick it up. Python users are famously averse to tools that slow down their dev cycles, even if it means better long-term stability.
I really don't agree. Sure, they don't need to be perfect, but keep in mind that many codebases have already standardized on something like (based)pyright or mypy, so there's a migration cost. If your analysis has a lot of false positives or misses a lot of what those type checkers catch, then there's little incentive to migrate. Sure, ty and pyrefly are much faster, but at the end of the day speed is only one consideration for a type checker.
Think of it this way. There are two groups. Group 1 has avoided typecheckers because they're a PITA, and Group 2 has configured mypy/pyright despite the devx pain. Group 1 is a lot bigger than Group 2. Group 2 is more lucrative per unit than Group 1.
With enough time, ty and pyrefly will approach perfection. If they're easy enough to use today, Group 1 should be able to adopt them without any extra pain (some typechecker is better than no typechecker). This gives them momentum. In a couple of years, ty/pyrefly may finally be better than mypy/pyright. Then Group 2 can start their ports.
This way, no one misses out. Group 2 still gets their perfect typechecker, just not immediately. But in that time, Group 1 is getting familiar with using typecheckers, and their sheer size helps build institutional momentum towards typecheckers as an essential piece of any Python dev flow.
If A ('a certain class of Python problems is permanently solved by typecheckers') and B ('every Python user has some typechecker') both become true, then that opens a lot of doors. Today, B is a harder problem than A. I'm guessing that compiled/JIT Python will be the next frontier once Python typing is solved. Wide typechecker adoption is a blocking requirement for that door to be opened.
No, I'm pretty sure they need to be perfect. If the tool's telling me bad information about types, first I'm going to lose a stupid amount of time debugging based on that wrong info, and then swear off the tool forever after I realize it was the tool that was wrong.
Type checking in Python involves guessing. Take a program like
    x = [3]
Should the type checker guess that the type of x is list[Any], or list[int], or list[Literal[3]], or something else? Libraries you depend on will pose more difficult versions of this question.
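You can see what a given checker actually decided with reveal_type, and pin down whichever interpretation you want with an explicit annotation. A minimal sketch (which answer the bare assignment gets varies by tool):

    from typing import Any, Literal, reveal_type  # importable since Python 3.11;
                                                  # checkers also accept the bare name

    x = [3]
    reveal_type(x)               # many checkers report list[int], but that's a design choice

    a: list[Any] = [3]           # explicit annotations remove the guesswork entirely
    b: list[int] = [3]
    c: list[Literal[3]] = [3]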
So it's not possible to expect the tool to be "perfect", whatever that means. Usually people take it to mean that it allows all code they think is idiomatic and reasonable, and disallows all code that could raise a TypeError at runtime. You can only expect the tool to be reasonably good, and perhaps to have an opinionated design that matches the way you would like to write Python.
Arguably it's the job of a linter and not a type checker, but if your code just assigns random things to random variables that never get used again, I hope something would point that out to you before it passes code review and gets merged to main, ideally before it even gets sent for review.
Ignoring the question of why there's some random assignment to x in the first place, where are its type hints? Those were added starting with Python 3.5 via PEP 484, just over 10 years ago, and have been extended since then. If our goal is maintainable code at scale, the first thing I'd expect of a type checker is to communicate with the user, stating a) that the variable in file foo.py on line 43 is missing a type hint and couldn't be guessed with confidence, and b) that it's just gonna guess it's a list[int]. For bonus points, the tool could run in interactive mode and ask the user directly. But even without a hint, assuming the variable actually is used later, how does it get used?
It seems like you'd just start with the strictest interpretation, list[Literal[3]], and relax it as it gets used. If it gets fed into a function that doesn't have hints and can't be introspected for whatever reason, relax it to list[int], and then further to list[Any] as necessary. Then, depending on the mode the user configured the tool to run in, print nothing, or a warning, or error out, or ask the user what to do if running interactively. Some of the config options could be strict, tolerant, and loose. Or maybe enforcing, permissive, and loose. Whatever color we want this bike shed to be.
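A rough sketch of that relaxation idea, with hypothetical functions (this is how I'd imagine it, not necessarily how any current checker behaves):

    def scale(values, factor):               # no hints: parameters are effectively Any
        return [v * factor for v in values]

    def scale_typed(values: list[int], factor: int) -> list[int]:
        return [v * factor for v in values]

    x = [3]                   # start strict: list[Literal[3]]
    y = scale_typed(x, 2)     # the annotated signature forces relaxing x to list[int]
    z = scale(x, 2)           # the unhinted call tells the checker nothing, so z degrades to Any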
More advanced tools let the user configure individual issues and their corresponding levels, but that may be considered too many options, overwhelming the user, who then doesn't use the tool.
As far as perfect typing goes, I mean, yeah, ideally, after satisfying the type checker in strict mode, which means no types need to be guessed at, since all variables and other locations missing needed type annotations were reported, or even annotated interactively, or, if we're real fancy, automatically with a comment (IMO programs should feel free to edit code directly, as a non-default option). That would mean the program could not unexpectedly throw an uncaught TypeError. That's a lot to ask of a dynamically, strongly typed language, but I'd settle for it never being utterly wrong.
What that means is, if the type checker says the function is returning a string, but of the 200 calls to that function, the type checker throws an error and says ten of those call sites expect it to return a dict, all I'm saying is that this hypothetical type checker had better be right about that. Nothing is worse than losing hours rooting around in the code looking for something that isn't actually there (or an errant semicolon, but this isn't C).
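Concretely, with a made-up function, the kind of mismatch I mean:

    def load_config(path: str) -> str:      # declared: returns the raw file contents
        with open(path) as f:
            return f.read()

    cfg = load_config("app.json")
    debug = cfg["debug"]      # this call site treats the result as a dict; the checker
                              # should flag exactly this line (indexing a str with a str),
                              # not send me hunting somewhere else in the codebase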
> Ignoring the question of why there's some random assignment to x in the first place, where are its type hints? Those were added starting with Python 3.5 via PEP 484, just over 10 years ago, and have been extended since then.
The code is in a third party library which doesn't have comprehensive type hints, and perhaps has type expectations so complicated that they can't be expressed in the Python type system.
Even if you enforce type hints on every line in your internal code, you're going to be relying at some point on libraries which are reliable but poorly type hinted. If you're not using that vast ecosystem of libraries, Python probably wasn't the best choice.
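One pragmatic bridge is a small local stub covering just the pieces you actually call (names here are hypothetical; how you point the checker at a stubs directory differs per tool, e.g. mypy's mypy_path or pyright's stubPath):

    # stubs/legacylib.pyi -- hand-written stub for an untyped third-party dependency
    # Declares only what we use, so the checker stops guessing at these call sites.
    def parse(text: str) -> dict[str, str]: ...

    class Client:
        def __init__(self, url: str) -> None: ...
        def fetch(self, key: str) -> bytes | None: ...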
Also, in the past, many tools were dumped by users for too many false positives. Tools like Astree and RV-Match got adoption by having no or low false positives.
I don't think that's an accurate metaphor. Seatbelts are expressly a runtime solution to car crashes, whereas Python's type checking is a) only done during development (not even during build), and b) completely reliant on third-party tools.
If you're looking for a car analogy, I would suggest comparing Python type checking to installing speed cameras on the factory floor.
This sounds like the typical "it's good enough" argument. Say we test the brakes manually nine times out of ten. This means we get really hard-to-diagnose bugs, because "this did work when I tested locally", and users get more runtime errors because of half-assed development.
Type systems should be treated like maths: it's either correct, or it's not. In the end, a type system is basically just that; it's math behind the scenes, more specifically a branch of maths called category theory.
At least for Ty (not sure about the others), it is explicitly for type checking, exposed as an LSP. It's not trying to compete with full LSP implementations. Most modern editors let you combine multiple LSPs; you shouldn't think of them as using only one at a time.
We actually do want ty to be a first-class LSP (i.e., a complete alternative to Pylance and others), and it already supports nearly all of the features you'd expect. I use it as my primary LSP today in lieu of Pylance!
I understand your confusion, because at first glance it seems like it's just a type checker (same issue with Pyrefly and Zuban), hence my comment about bringing forward the fact that it's a fully featured LSP.