felixfbecker's comments | Hacker News

MCP doesn't force models to output JSON, quite the opposite. Tool call results in MCP are text, images, audio — the things models naturally output. The whole point of MCP is to make APIs digestible to LLMs.
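
Roughly, a tool call result looks something like this (a sketch based on my reading of the spec, not any particular SDK; the values are made up):

    const toolResult = {
      content: [
        { type: "text", text: "Found 3 open issues assigned to you." },
        { type: "image", data: "<base64-encoded PNG>", mimeType: "image/png" },
      ],
    };
    // The model consumes these parts directly; there is no JSON schema the
    // model has to emit in order to read the result.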


I think perhaps they're referring more to the tool descriptions... not sure why they said output.


An SVG doesn't need to support scripting. When you load an SVG through an <img> tag, for example, no <script>s run either (scripts only run if you use <iframe>, <object>, or inline the SVG in HTML5). When you serve the SVG (or the HTML it is inlined in) with a CSP that doesn't allow inline scripts, no scripts run. It's totally possible to render an SVG without scripts (most SVGs do not contain scripts), and various mechanisms for this are already implemented in browsers.
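
For example, a minimal sketch of serving an SVG with such a CSP (using Node's built-in http module; the file name and port are placeholders):

    import { createServer } from "node:http";
    import { readFile } from "node:fs/promises";
    createServer(async (_req, res) => {
      const svg = await readFile("icon.svg"); // placeholder file
      res.writeHead(200, {
        "Content-Type": "image/svg+xml",
        // No script sources and no 'unsafe-inline': embedded <script>s won't run.
        "Content-Security-Policy": "default-src 'none'; style-src 'unsafe-inline'",
      });
      res.end(svg);
    }).listen(8080);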


>An SVG doesn't need to support scripting.

No shit? I bet that's what I meant when I said "SVG inline needs to support scripting" then?

>It's totally possible to render an SVG without scripts (most SVGs do not contain scripts) and various mechanisms for this are already implemented in browsers.

Yes, it is totally possible to render an SVG without scripts, and it is also possible to render one with them. Hence, when I say something like "if Safari's SVG implementation meant that SVG favicons were open to either XML exploits or scripting exploits", that IF is a really important indicator: hey, if they did it as an inline SVG, and now it is sitting inside the browser chrome with heightened permissions, it would be a problem. Furthermore, the XML exploits available in the browser chrome might also be more deadly.

But why would they do this? Hey, I don't know. I have noticed that sometimes people do dumb things, including browser developers, or they don't catch edge cases because they don't realize they exist.

I also noticed that one of the comments about what had been implemented mentioned support for SVG favicons as a data URI. If an SVG favicon was implemented in this way, it might very well be the edge case where the data URI exists as an "inline" image. That seems unlikely, because a data URI should normally be in an img tag, but I have also experienced some unlikely or unexpected things with data URIs before, so I would consider it a possible place for things to go wrong.


OpenAI and Anthropic respect robots.txt afaik


To add anecdotally based on logging on my portfolio site, all major US players (OpenAI, Google, Anthropic, Meta, CommonCrawl) appeared to respect robots.txt as they claim to do (can't say the same of Alibaba).

Sometimes I do still get requests with their user agents, but generally from implausible IPs (residential IPs, or "Google-Extended" from an AWS range, or the same IP claiming to be multiple different bots, ...) - never from the bots' actual published IP addresses (which I did see before adding robots.txt) - which makes me believe it's some third party either intentionally trolling or using the larger players as cover for their own bots.


Using residential IPs is standard operating procedure for companies that rely on collecting information via web scraping. You can rent residential egress IPs. Sometimes this is done in a (kind of) legit way by companies that actually subscribe to residential ISPs. Mostly it's done by malware hijacking consumer devices.


Noooooope! They completely ignore crawl frequency in my experience. Bing too. Only Google seems to obey it.


They don't.


Microsoft contributes a lot of web standard implementations upstream to Chromium. They are not just letting Google do all the work, as your comment makes it sound. They could have chosen to do the same with Firefox, so the decision to fork Chromium rather than Firefox must have had other reasons.


Right, I use a browser extension that automatically declines consent.

On my personal homepage I also use anonymous, privacy-preserving, GDPR-compliant analytics that doesn't require prompts. Other websites made a choice.


Or it could be a browser preference that then sends an HTTP header? Wait: https://developer.mozilla.org/fr/docs/Web/HTTP/Headers/DNT


The problem with DNT was that there was no established legal basis governing its meaning, and some browsers just sent it by default, so corporations started arguing it's meaningless because there's no way to tell whether it indicates a genuine request or is merely an artefact of the user's browser choice (which may be meaningless as well if they didn't get to choose their browser).

As the English version of that page says, it's been superseded by GPC, which has more widespread industry support and is trying to get legal adoption, though I'm seeing conflicting statements about whether it has any legal meaning at the moment, especially outside the US - the described effects in the EU seem redundant given what the GDPR and ePrivacy Directive establish as the default behavior: https://privacycg.github.io/gpc-spec/explainer
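
For what it's worth, honoring the signal server-side is straightforward; a minimal sketch with Node's built-in http module, assuming the `Sec-GPC: 1` request header described in that explainer (the response policy here is made up):

    import { createServer } from "node:http";
    createServer((req, res) => {
      // Node lowercases incoming header names.
      const optedOut = req.headers["sec-gpc"] === "1";
      res.setHeader("Content-Type", "text/plain");
      // Made-up policy: skip loading analytics/trackers when the signal is present.
      res.end(optedOut ? "GPC received, tracking disabled" : "no GPC signal");
    }).listen(8080);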


That's basically how goroutines work in Go: you opt into concurrency with the `go` keyword, and calls are blocking by default. In JS it's the other way around: it's concurrent by default and you opt into blocking with the `await` keyword. (Except in Go you get true parallelism for CPU-bound tasks too, while in JS it's only for I/O.)
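
A rough sketch of the JS side (fetchUser/fetchPosts are made-up stand-ins): async calls start running as soon as you invoke them, and `await` is the explicit point where the current function suspends:

    async function fetchUser(): Promise<string> {
      return new Promise((resolve) => setTimeout(() => resolve("alice"), 100));
    }
    async function fetchPosts(): Promise<string[]> {
      return new Promise((resolve) => setTimeout(() => resolve(["post"]), 100));
    }
    async function sequential() {
      const user = await fetchUser();   // suspends; fetchPosts hasn't started yet
      const posts = await fetchPosts(); // total ≈ 200ms
      return { user, posts };
    }
    async function concurrent() {
      const userPromise = fetchUser();  // both calls are already in flight
      const postsPromise = fetchPosts();
      const [user, posts] = await Promise.all([userPromise, postsPromise]); // ≈ 100ms
      return { user, posts };
    }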

Both have their pros and cons. I've seen problems in Go codebases where some I/O operation blocks the main goroutine because it's not obvious from the stack that something would best be run concurrently, and it's easy to ignore until it gets worse (at which point it's annoying to debug).


Any public information is eventually priced into the stock by the market.

Say you buy the stock even though you didn't read the DEI statement, but other people who had read it bought the stock before you. Their purchases drove up the stock price, so you had to pay more for the stock; you got defrauded of the delta. Especially if the truth then comes out, the price goes down, and you take losses.


I think the kind of teams that always stay on top of the latest TypeScript version and use the latest language features are also more likely to always stay on top of the latest Node versions. In my experience TypeScript upgrades actually more often need migrations/fixes for new errors than Node upgrades. Teams that don't care about latest V8 and Node features and always stay on LTS probably also care less about the latest and greatest TypeScript features.


I work on Notion, a large app that’s TypeScript on both client & server.

We find Typescript much easier to upgrade than Node. New Node versions change performance characteristics of the app at runtime, and sometimes regress complex features like async hooks or have memory leaks. We tend to have multi-week rollout plans for new Node versions with side-by-side deploys to check metrics.

TypeScript, on the other hand, someone can upgrade in a single PR, and once you get the types to check, you’re done and you merge. We just got to the latest TS version last week.


This is true, but in other cases they added keywords in ways that could work with type stripping. For example, the `as` keyword for casts has existed for a long time, and type stripping could strip everything after the `as` keyword with a minimal grammar.

When TypeScript added const assertions, they added them as `as const`, so type stripping could have still worked, depending on how loosely it is implemented.

I think there is a world where type stripping exists (which the TS team has been in favor of) and the TS team might consider how it affects type stripping in future language design. For example, the `satisfies` keyword could have also been added by piggy-backing on the `as` keyword, like:

    const foo = { bar: 1 } as subtype of Foo
(I think not using `as` is a better fit semantically but this could be a trade-off to make for better type stripping backwards compatibility)


I don't know a lot about parser theory, and would love to learn more about ways to make parsing resilient in cases like this one. Simple cases like "ignore rest of line" make sense to me, but I'm unsure about "adversarial" examples (in the sense that they are meant to beat simple heuristics). Would you mind explaining how e.g. your `as` stripping could work for one specific adversarial example?

    function foo<T>() {
        return bar(
            null as unknown as T extends boolean
            ? true /* ): */
            : (T extends string
                ? "string"
                : false
            )
            )
    }

    function bar(value: any): void {}
Any solution I can come up with suffers from at least one of these issues:

- "ignore rest of line" will either fail or lead to incorrect results - "find matching parenthesis" would have to parse comments inside types (probably doable, but could break with future TS additions) - "try finding end of non-JS code" will inevitably trip up in some situations, and can get very expensive

I'd love a rough outline or links/pointers, if you can find the time!

[0] TS Playground link: https://www.typescriptlang.org/play/?#code/AQ4MwVwOwYwFwJYHs...


Most parsers don't actually work with "lines" as a unit, those are for user-formatting. Generally the sort of building blocks you are looking for are more along the lines of "until end of expression" or "until end of statement". What defines an "expression" or a "statement" can be very complex depending on the parser and the language you are trying to parse.

In JS, because it is a fun example, "end of statement" is defined in large part by Automatic Semicolon Insertion (ASI), whether or not semicolons even exist in the source input. (Even if you use semicolons regularly in JS, JS will still insert its own semicolons. Semicolons don't protect you from ASI.) ASI is also a useful example because it is an ancient example of a language design intentionally trying to be resilient. Some older JS parsers even would ignore bad statements and continue on the next statement based on ASI determined statement break. We generally like our JS to be much more strict than that today, but early JS was originally built to be a resilient language in some interesting ways.
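
The classic ASI illustration (plain JS/TS; the behavior follows from the rules above):

    function broken() {
      return        // ASI inserts a semicolon right here: `return;`
      {
        answer: 42  // parsed as a block with a labeled statement, not an object
      }
    }
    console.log(broken()); // undefined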

One place to dive into that directly (in the middle of a deeper context of JS parser theory): https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...


Thanks for the response, but I'm aware of the basics. My question is pointed towards making language parsers resilient towards separately-evolving standards. How would you build a JS parser so that it correctly parses any new TS syntax, without changing behavior of valid code?

The example snippet I added is designed to violate the rules I could come up with. I'd specifically like to know: what are better rules to solve this specific case?


> How would you build a JS parser so that it correctly parses any new TS syntax, without changing behavior of valid code?

I don't know anything about parsers besides what I learned from that one semester worth of introduction class I took in college, but from what I understand of your question, I think the answer is you can't, simply because we can't look into the future.


In your specific case:

1. Automatic semicolon insertion would next want to kick in at the } token, so that's the obvious end of the statement. If you've asked it to ignore from `as` to the end of the statement (as you've established with your "ignore to the end of the 'line'"), that's where it stops ignoring.

1A. Obviously in that case `bar(null` is not a valid statement after ignoring from `as` to the end of the statement.

2. The trick to your specific case, which you've stumbled into, is that `as` is an expression modifier, not a statement modifier. The argument to a function is an expression, not a statement. That definitely complicates things, because "end of the current expression" is often a lot more complicated than ASI (and people think ASI is complicated). Most parsers are going to have some sort of token state counter for nested parentheses (this is a fun implementation detail of different parsers, because while recursion is easy enough in "context-free grammars", the details of tracking that recursion are generally not technically "context-free" at that point, so sometimes it lives in the tokenizer, sometimes in a context extension to the parser itself, sometimes in a stack implementation detail of the parser), and you are going to want to ignore to the next "," token that signals a new argument or the next ")" that signals the end of arguments, with respect to any () nesting.

2A. Because of how complicated expression parsing can get, that probably sets some resiliency bounds on your "ignorable grammar": it may require that internally it still follows most of the logic of your general expression language: balanced nested parentheses, no dangling commas, usual comment syntax, etc.

2B. You probably want to define those sorts of boundaries anyway. The easiest way is to say that ignorable extensions such as `as` must themselves parse as if they were a valid expression, even if the language cannot interpret their meaning. You can think of this as the meta-grammar where one option for an expression might be `<expression> ::= <expression> 'as' <expression>`, with the second expression being parseable but ignored after parsing by the language runtime and JIT. You can see that approach in the syntax description for Python's original PEP 3107 syntax-only type hints standard [1]; it is surprisingly succinct there. (The grammar proposed in the Type Annotations proposal to TC39 is a lot more specific and a lot less succinct [2], for a number of reasons.)

[1] https://peps.python.org/pep-3107/

[2] https://tc39.es/proposal-type-annotations/grammar.html


CSS syntax has specific rules for how to handle unexpected tokens. E.g. if an unexpected character is encountered in a declaration, the parser ignores characters until the next ; or }. But CSS does not have arbitrary nesting, so this makes it easier.

Comments, as in your example, are typically stripped in the tokenization stage, so they would not affect parsing. The TypeScript type syntax has its own grammar, but it uses the same lexical syntax as regular JavaScript.

A “meta grammar” for type expressions could say to skip until the next comma or semicolon, and it could recognize parentheses and brackets as nesting and fully skip such blocks as well.
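
A rough sketch of what that skipping could look like over a token stream (assuming strings and comments are already handled by the tokenizer; treating < and > as nesting is hand-waved here, since telling generics apart from comparisons is exactly the hard part):

    // Returns the index of the first token after the skippable type annotation.
    function skipType(tokens: string[], start: number): number {
      const open = new Set(["(", "[", "{", "<"]);
      const close = new Set([")", "]", "}", ">"]);
      let depth = 0;
      for (let i = start; i < tokens.length; i++) {
        const t = tokens[i];
        if (open.has(t)) {
          depth++;
        } else if (close.has(t)) {
          if (depth === 0) return i; // closes an enclosing block: the type ends here
          depth--;
        } else if (depth === 0 && (t === "," || t === ";")) {
          return i;                  // top-level separator: the type ends here
        }
      }
      return tokens.length;
    }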

The problem with the ‘satisfies’ keyword is that a parser without support would not even know it is part of the type language. New ‘skippable’ syntax would have to be introduced as ‘as satisfies’ or similar, triggering the type-syntax parsing mode.


I understand that you can define a restricted grammar that will stay parseable, as the embedded language would have to adapt to those rules. But that doesn't solve the question, as Typescript already has existing rules which overlap with JS syntax. The GP comment was:

> For example, the `as` keyword for casts has existed for a long time, and type stripping could strip everything after the `as` keyword with a minimal grammar.

My question is: what would a grammar like this look like in this specific case?


How about:

    TypeAssertion ::= Expression “as” TypeStuff
    TypeStuff ::= TypeStuffItem+
    TypeStuffItem ::= Block | any token except , ; ) } ]
    Block ::= ParenBlock | CurlyBracketsBlock | SquareBracketsBlock | AngleBracketsBlock
    ParenBlock ::= ( ParenBlockItem* )
    ParenBlockItem ::= Block | any token except ( )
etc.


It can’t strip what’s after the as keyword without an up-to-date TS grammar, because `as` is an expression. The parser needs to know how to parse type expressions in order to know when the RHS of the `as` expression ends.

Let’s say that typescript adds a new type operator “wobble T”. What does this desugar to?

    x as wobble
    T
Without knowing about the new wobble syntax this would be parsed as `x as wobble; T` and desugar to `x; T`

With the new wobble syntax it would be parsed as `x as (wobble T);` according to JS semicolon insertion rules because the expression wobble is incomplete, and desugar to `x`


The “as” expression is not valid JavaScript anyway, so the default rule for implicit semicolon insertion does not apply. A grammar for type expressions could define whether and how semicolons should be inserted.


TypeScript already has such type operators though. For example:

    type T = keyof
    {
      a: null,
      b: null
    }
Here T is “a”|”b”; no automatic semicolon is inserted after `keyof`. While I don’t personally write code like this, I’m sure that someone does. It’s perfectly within the rules, after all.

While it’s true that TS doesn’t have to follow JS rules for semicolon insertion in type expressions, it always has done, and probably always should do.


This is just the default. Automatic semicolon insertion only happens in specific, well-defined cases, for example after the “return” keyword or when an invalid expression can be made valid by semicolon insertion. Neither applies here.


> In SVG, all shapes are absolutely positioned. Text does not wrap automatically.

This is not really true — you can position elements inside the SVG coordinate system using percentages and you can mix absolute coordinates and percentages. This allows you to have elements grow and shrink in reaction to width and height without distortion.

Wrapping text is possible with <foreignObject>: simply let HTML/CSS do the text layout wherever you need text within the SVG.
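
For example, a hand-written sketch (the dimensions are arbitrary) mixing percentage positioning with a <foreignObject> for wrapped text:

    const chart = `
      <svg viewBox="0 0 400 200" xmlns="http://www.w3.org/2000/svg">
        <rect x="5%" y="10%" width="90%" height="80%" fill="none" stroke="currentColor"/>
        <foreignObject x="10%" y="20%" width="80%" height="60%">
          <div xmlns="http://www.w3.org/1999/xhtml" style="font: 14px sans-serif">
            This paragraph wraps automatically because HTML/CSS does the text
            layout, while the rect scales with the SVG viewport.
          </div>
        </foreignObject>
      </svg>`;
    document.body.insertAdjacentHTML("beforeend", chart);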

However it is still true that you usually want to do a bunch of calculations in JS based on the width to know how many chart ticks you want, how many labels, etc. But that is pretty easy to compute with the helpers from D3.


Hello! Author here. When I say absolute positioning, I mean it in the CSS sense (as in position: absolute) to contrast it with CSS Normal Flow, Flexbox, Grid, Floats etc.

While it's true that SVG allows percent positioning and foreignObject with HTML inside, this does not help us for the task at hand: Positioning shapes/text in relation to other shapes/text. This is out of scope for SVG to my knowledge, while it's the natural domain of CSS.

Almost all of our charts have text labels of different size. In many charts, we measure the text with measureText() on a canvas first, then compute the SVG positions manually. Or we have HTML/SVG hybrids where we measure the text using getBoundingClientRect(). This two-stage measure/layout process is what we seek to reduce.
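
To make the two-stage process concrete, a minimal sketch of the measure step (labels, font, and axis position are made up):

    const ctx = document.createElement("canvas").getContext("2d")!;
    ctx.font = "12px sans-serif";
    const labels = ["Jan", "Feb", "Mar"];                        // made-up tick labels
    const widths = labels.map((l) => ctx.measureText(l).width);  // measure stage
    // Layout stage: right-align each <text> element against a made-up axis at x = 200.
    const xPositions = widths.map((w) => 200 - w);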

