Hacker News | beart's comments

The switch from plan mode to build is not always clearly defined. On a number of occasions, I've been in plan mode and entered a follow-up prompt to modify the plan. However, instead of updating the plan, the agent took the follow-up text as approval to build and automatically switched to building.

Ask mode, on the other hand, has always explicitly indicated that I need to switch out of ask mode to perform any actions.

This is my experience with Cursor CLI.


The first time I recall encountering this sort of feature was in one of the early SimCity games. I wonder if this being a feature of Claude indicates the humanity of some engineer behind it, or if it is a deliberate effort to apply humanity to the agent.

In fact, ‘Reticulating splines’ from SimCity 2000's load screen is one they use.

The Windows 8 equivalent server edition also included the upgrade to Metro UI. I don't know, I guess MS figured IT wanted to provision Windows services using a surface tablet?

I actually really did like Windows Phones though. I can imagine a world with a third competitor in that space today... But MS didn't seem to have any understanding or ability to develop an ecosystem that works. Even when they were literally paying people to write apps for their app store, it was just terrible.


UUIDs aren't random by design, and the structure is not pointless. Calling something you don't understand "stupid" is probably not a good approach to life.

One example where UUIDs are useful is usage as primary keys in databases. The constraints provide benefits, such as global uniqueness across distributed systems.


The global uniqueness of a uuid v4 is the global uniqueness of pulling 122 bits from a source of entropy. Structure has nothing to do with it, and pulling 128 bits from the same source is strictly (if not massively) superior at that.
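
To make the 122-vs-128 point concrete, here is a small sketch using Python's standard `uuid` and `secrets` modules. It assumes the usual v4 layout, where 4 version bits and 2 variant bits are fixed and everything else is random.

```python
import secrets
import uuid

u = uuid.uuid4()

# The format fixes six of the 128 bits: a 4-bit version and a 2-bit variant.
version_bits = (u.int >> 76) & 0xF    # always 0b0100 for v4
variant_bits = (u.int >> 62) & 0b11   # always 0b10 for the RFC variant
random_bits = 128 - (4 + 2)

print(u, "version", u.version, "random bits:", random_bits)

# An unstructured identifier: all 128 bits come straight from the entropy source.
raw = secrets.token_bytes(16)
print(raw.hex(), "random bits:", len(raw) * 8)
```

The uniqueness comes from the random bits alone; the six fixed bits only identify the format.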

I stand corrected. I was thinking of the sequential nature of UUID v7, or SQL Server's sequential ID.

How do you safely convert a 2 byte character to a 1 byte character?

Easily! If it doesn't convert successfully because it includes characters outside of the range of the target codepage then the equality condition is necessarily false, and the engine should short-circuit and return an empty set.
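
A rough sketch of that short-circuit idea in Python, not how any real engine implements it. The names `to_codepage` and `lookup` are made up for illustration, and CP1252 is only an assumed column code page.

```python
from typing import List, Optional


def to_codepage(param: str, codepage: str = "cp1252") -> Optional[bytes]:
    """Encode the parameter in the column's code page, or None if it cannot fit."""
    try:
        return param.encode(codepage)
    except UnicodeEncodeError:
        return None


def lookup(rows: List[bytes], param: str, codepage: str = "cp1252") -> List[bytes]:
    needle = to_codepage(param, codepage)
    if needle is None:
        # No value stored in this code page can equal the parameter,
        # so the equality is false for every row: return an empty set.
        return []
    return [row for row in rows if row == needle]


rows = ["café".encode("cp1252"), b"NY"]
print(lookup(rows, "café"))   # representable: compared byte for byte
print(lookup(rows, "日本"))    # not representable: short-circuits to []
```

The point is that the parameter gets narrowed once, instead of every row in the column being widened for the comparison.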

I agree with your first point. I've seen this same issue crop up in several other ORMs.

As to your second point: VARCHAR uses N + 2 bytes of storage, whereas NVARCHAR uses N*2 + 2 bytes (at least on SQL Server). The vast majority of character fields in databases I've worked with do not need to store unicode values.
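
As a quick back-of-the-envelope check of those two formulas, assuming a single-byte code page for VARCHAR and no supplementary characters or compression for NVARCHAR (the function names are just for illustration):

```python
def varchar_bytes(n_chars: int) -> int:
    """N + 2: one byte per character plus 2 bytes of length overhead."""
    return n_chars + 2


def nvarchar_bytes(n_chars: int) -> int:
    """N*2 + 2: two bytes per character plus 2 bytes of length overhead."""
    return n_chars * 2 + 2


for n in (1, 2, 50):
    print(f"{n:>3} chars: varchar={varchar_bytes(n)} bytes, nvarchar={nvarchar_bytes(n)} bytes")
```

For a 50-character value that is 52 bytes against 102.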


> The vast majority of character fields in databases I've worked with do not need to store unicode values.

This has not been my experience at all. Exactly the opposite, in fact. ASCII is dead.


The vast majority of text fields I see are coded values that are perfectly fine using ASCII, but I deal mostly with English-language systems.

Text fields that users can type into directly, especially multiline ones, tend to need Unicode, but they are far fewer.


English has plenty of Unicode — claiming otherwise is such a cliché…

Unicode is a requirement everywhere human language is used, from Earth to the Boötes Void.


I am talking about coded values, like Status = 'A', 'B' or 'C'

Taking double the space for this stuff is a waste of resources, and nobody usually cares about extended characters here, in English-language systems at least. They just want something more readable than integers when querying and debugging the data. End users will see longer descriptions, joined from code tables or from app caches, which can have Unicode.


It's way better to just use a DBMS that supports enums. I know SQL Server isn't one of those, but I still don't store my coded values as strings.

How do you store them? Also enums are not user configurable normally. It would be a good feature to have them, but they don't work well in many cases.

Typical code tables with code, description and anything else needed for that value which the user can configure in the app.

Sure, you can use integers instead of codes, but now all your results look like 1, 2, 3, 4 for all your coded columns when trying to debug or write ad-hoc stuff. Also, ints are not variable length, so you're wasting space for short codes, and you have to know ahead of time whether it's going to be 1, 2, 4 or 8 bytes.


Enums are for non user-configurable values.

For configurable values, obviously you use a table. But those should have an auto-integer primary key and if you need the description, join for it.

Ints are by far the more efficient way to store and query these values -- the length of the string is stored as an int and variable length values really complicate storage and access. If you think strings save space or time, that is not right.


>Enums are for non user-configurable values

In the systems I work with most coded values are user configurable.

>But those should have an auto-integer primary key and if you need the description, join for it.

That's not ergonomic when querying data or debugging: things like postal state are now 11 instead of 'NY'.

select * from addresses where state = 11, no thanks.

Your whole result set becomes a bunch of ints that can be easily transposed, causing silly errors. Of course, I have seen systems that use GUIDs to avoid collisions, and boy is that fun. Just use varchar, or char if you're penny-pinching and OK with fixed sizes.

>the length of the string is stored as an int

No, it's stored as a smallint (2 bytes). So a single-character code is 3 bytes rather than a 4-byte int, and 2 chars is the same as an int. They do not complicate storage access in any meaningful way.

You could use smallint or tinyint for your primary key and I could use char(2) and char(1) and get readable codes if I wanted to really save space.
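
For reference, the byte counts being argued about can be tabulated like this. It is a sketch that assumes SQL Server's integer sizes, a single-byte collation for char/varchar, and ignores row headers and the null bitmap.

```python
# Per-value storage in bytes (assumptions as stated above).
INT_TYPES = {"tinyint": 1, "smallint": 2, "int": 4, "bigint": 8}


def char_bytes(n: int) -> int:
    """char(n): fixed length, n bytes."""
    return n


def varchar_bytes(length: int) -> int:
    """varchar: actual length plus a 2-byte length entry."""
    return length + 2


comparison = [
    ("varchar code 'A'", varchar_bytes(1)),
    ("varchar code 'NY'", varchar_bytes(2)),
    ("char(1) code", char_bytes(1)),
    ("char(2) code", char_bytes(2)),
] + [(f"{name} key", size) for name, size in INT_TYPES.items()]

for label, size in comparison:
    print(f"{label:<18} {size} bytes")
```

Either way, the difference is a byte or two per value.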


> They do not complicate storage access in any meaningful way.

Sure they do, because now your row / index is variable length rather than fixed length. Way more complicated. Even 3 bytes is way more complicated to deal with than 4 bytes.

> select * from addresses where state = 11, no thanks.

I will agree that isn't fun. Is it still the trade-off I make? Absolutely. And it's not really that big of a problem; I just do a join. It also helps prevent people from using codes instead of querying the database for the correct value -- what's the point of user-configuration if someone hard-codes 'NY' in a query or in the code?


>Sure they do, because now your row / index is variable length rather than fixed length. Way more complicated.

Come on, it's literally a 2-byte-per-column header in the row, so it just sums the column lengths to get the offset. It does the same thing for fixed length, except it gets the column length from the schema.

It's not much more complicated than a fixed-length column; the only difference is that the column length is stored in the row rather than in the schema. I am not sure where you are getting the idea that it's way more complicated, nor the 3 vs 4 byte thing. The whole row is a variable-length structure and designed as such. Null values change the row length for fixed and variable data types alike and have to be accounted for, since a null takes up no space in the column data; it's only in the null bitmap.

> what's the point of user-configuration if someone hard-codes 'NY' in a query or in the code

Because it doesn't matter, 'NY' isn't changing just like 11 the int wouldn't change, but 'NY' is way easier to understand and catch mistakes with and search for code without hitting a bunch of nonsense and distinguish when 10 columns are all coded next to each other in a result set.

I prefer my rows to be a little more readable than 1234, 1, 11, 2, 15, 1 ,3 and the users do too.

I have had my fill of transposition bugs where someone accidentally uses the wrong int on a PK id from a different table and still gets a valid but random result that passes a foreign key check. It's almost enough for me to want to use GUIDs for PKs. Almost. At least with coded values it is easier to spot, because even with single-character codes people tend to pick things that make sense for the column values, you know, 'P' for pending, 'C' for complete, etc., vs 1, 2, 3, 4 used over and over across every different column with an auto-increment.


> Come on, it's literally...

You're the one saying a 2 character string is somehow a space savings. If we're going to split hairs that finely then you have to know that any row with a variable length string makes the entire row/index variable length and that is a net storage and performance loss. It's worse in every way than a simple integer. I will admit that it ultimately doesn't matter. But I'd also argue using an nvarchar in place of varchar for this also doesn't matter. It's not just premature optimization it's practically useless optimization.

> Because it doesn't matter, 'NY' isn't changing just like 11 the int wouldn't change, but 'NY

That's not what happens but what happens is that somebody renames New York to New Eburacum and now your code doesn't match the value and it just adds more confusion.

But I'll grant you that it's totally fine. It's even more fine if you don't use varchar and instead use char(x).


>You're the one saying a 2 character string is somehow a space savings. If we're going to split hairs that finely then you have to know that any row with a variable length string makes the entire row/index variable length and that is a net storage and performance loss.

The row is always a variable-length structure. It has flags noting how many columns have values and whether there is a variable-length section or not. Only rows with no variable-length fields at all have no variable-length section, and that is a bit-flag check in the header.

You are making a non-argument. Variable-length fields can be a space savings over an int with single-char codes, which are very common, and do not impact performance in any meaningful way. Besides that, one could use fixed-length chars and still get the other benefits I mentioned, with the exact same space usage and processing as fixed-length ints.

>That's not what happens but what happens is that somebody renames New York to New Eburacum

Changing the descriptive meaning of an entry causes all sorts of problems, and even more so if it is an int. Because it's completely opaque, it's much harder to see an issue in the system: everything is a bunch of ints that do not correlate in any way to their meaning.

Changing the description to something that has the same meaning worded differently is usually not an issue and still gives good debug visibility to the value. If you and your users consider New Eburacum synonymous with New York, then having the code stay 'NY' should not be an issue and still be obvious when querying the data.

Unless someone is using the short code in a user visible way and it has to be updated. State is a common one that does this and nobody is changing state names or codes because it is a common settled natural key.

In the rare situation where this actually needs to be done, one can update existing data; this is not an issue in practice. You have to be extremely cautious updating the description of a code, because much data was entered under the previous description and the meaning that it carries. Having the code carry some human meaning makes it more obvious to maintainers that this should be done with care. Many times it would involve deprecating the old one and making a new one with a different code, because they have different meanings. Having a table instead of an enum allows other columns to hold this metadata.

This is not the same issue as say using a SSN for a person ID.


Please take literally one course.

Do NOT use mnemonics as primary keys. It WILL bite you.


https://en.wikipedia.org/wiki/Natural_key you should have learned this in your courses.

Calm down, I am not suggesting using this for actual domain entity keys. These are used in place of enums and have their advantages. I have been doing this a long time and it has not bitten me. I have also seen many other systems designed this way working just fine.

Using an incrementing surrogate key for, say, postal state code serves no purpose other than making things harder to use and debug. Most systems have many code values such as this, and using surrogate keys would lead to a bunch of overlapping, hard-to-distinguish int data that leads to all sorts of issues.


The way to do enums in SQL (generally, not just MSSQL) is another table. It's better that they don't offer several ways to do the same thing.

Mostly agree. Separate tables can have multiple attributes besides a text description, and can be exposed for modification to the application easily so users or administrators can add and modify codes.

A common extra attribute for a coded value is something for deprecation / soft delete, so that it can be marked as no longer valid for future data while existing data can remain with that code. Others include the date ranges it's valid for and parent-child code relationships.
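
A minimal sketch of such a code table, using Python's built-in `sqlite3` so it runs anywhere. The table and column names (`order_status`, `is_active`, `orders`) are invented for illustration, and SQLite stands in for whatever DBMS is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE order_status (
    code        TEXT PRIMARY KEY,            -- short, human-readable code
    description TEXT NOT NULL,               -- user-configurable label
    is_active   INTEGER NOT NULL DEFAULT 1   -- 0 = deprecated / soft-deleted
);
CREATE TABLE orders (
    id     INTEGER PRIMARY KEY,
    status TEXT NOT NULL REFERENCES order_status(code)
);
INSERT INTO order_status VALUES
    ('P', 'Pending', 1), ('C', 'Complete', 1), ('H', 'On hold', 0);
INSERT INTO orders (status) VALUES ('P'), ('H'), ('C');
""")

# Raw rows stay readable without a join.
raw_rows = conn.execute("SELECT id, status FROM orders ORDER BY id").fetchall()
print(raw_rows)

# Only active codes are offered for new data; old rows keep the deprecated 'H'.
selectable = [row[0] for row in conn.execute(
    "SELECT code FROM order_status WHERE is_active = 1 ORDER BY code")]
print(selectable)

# The foreign key still rejects codes that were never defined.
try:
    conn.execute("INSERT INTO orders (status) VALUES ('X')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print("unknown code rejected:", rejected)
```

Validity date ranges or a parent code would just be more columns on `order_status`.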

Enums would be a good feature but they have a much more limited use case for static values you know ahead of time that will have no other attributes and values cannot be removed even if never used or old data migrated to new values.

Common real world codes like US postal state can take advantage of there being agreed upon codes such as 'NY' and 'New York'.


While I generally would prefer lookup tables, it's much easier to sell dev teams on "it looks and acts like a string - you don't have to change anything."

Those are all single byte characters in UTF-8.

We are talking nvarchar here. Yes, UTF-8 solves this issue completely, and MSSQL supports it nowadays with varchar.

But nvarchar is UTF-16

No. Look closer.

Just to be pedantic, those characters are in 'ANSI'/CP1252 and would be fine in a varchar on many systems.

Not that I disagree — Win32/C#/Java/etc have 16-bit characters, your entire system is already 'paying the price', so weird to get frugal here.


My comment contains two glyphs that are not in CP1252.

Also less awkward to make it right the first time, instead of explaining why someone can’t type their name or an emoji

Specifically not talking about a name field

> Unicode is a requirement everywhere human language is used

Strange then how it was not a requirement for many, many years.


Oh, it was. It was fun being unable to type a euro sign or the name Seán without it being garbled. Neither were matched quotation marks, and arguably computer limitations killed off naïve and café too.
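
For anyone who never saw the garbling, this is roughly what it looked like. The sketch assumes the classic failure mode of UTF-8 bytes being read back as CP1252; other mismatches garble differently.

```python
def garble(text: str) -> str:
    """Encode as UTF-8, then misread the bytes as CP1252 (classic mojibake)."""
    return text.encode("utf-8").decode("cp1252")


print(garble("Seán"))
print(garble("€"))

# Latin-1 has no euro sign at all, so the character cannot even be stored.
try:
    "€".encode("latin-1")
    euro_in_latin1 = True
except UnicodeEncodeError:
    euro_in_latin1 = False
print("euro representable in Latin-1:", euro_in_latin1)
```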

Don’t confuse people groaning and putting up with limitations as justifying those limitations.


In Portugal it always was, that is why we got to use eh for é, ah for á, he for è, c, for ç and many other tricks.

Shared by other European languages, like oe for ö in German, kalimera for καλημέρα, and so on all around the world in non-English speaking countries during the early days of computing.


Or rather, computers had inadequate support.

It was a mess back then though. Unicode fixed that.

I'm not convinced that Unicode fixed anything. I was kind of hoping, way back when, that everyone would adopt ASCII, as a step to a more united world. But things seem to have got more differentiated, and made things much more difficult.

The options were never ASCII or Unicode though. Before Unicode we had ASCII + lots of different incompatible encodings that relied on metadata to be properly rendered. That's what Unicode fixed.

Besides, I like being able to put things like →, €, ∞ or ∆ into text. With ASCII, a lot of things that are nowadays trivial would need markup languages.
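
A tiny illustration of the "relied on metadata" problem: the same byte means a different character depending on which legacy code page you assume. The three code pages below are just examples picked from Python's standard codecs.

```python
raw = b"\xe9"  # one byte, with no metadata saying which code page it came from

decoded = {enc: raw.decode(enc) for enc in ("latin-1", "cp1251", "cp437")}
for enc, char in decoded.items():
    print(f"{enc:8} -> {char}")

# With Unicode the character itself has one identity (U+00E9), and UTF-8
# gives it one byte sequence regardless of locale.
utf8_bytes = "é".encode("utf-8")
print(utf8_bytes)
```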


For whom? Certainly not any of the humans trying to use the computer.

Some examples of coded fields that may be known to be ascii: order name, department code, business title, cost center, location id, preferred language, account type…

To complicate matters SQL Server can do Nvarchar compression, but they should have just done UTF-8 long ago:

https://learn.microsoft.com/en-us/sql/relational-databases/d...

Also UTF-8 is actually just a varchar collation so you don't use nvarchar with that, lol?


Generally if it stores user input it needs to support Unicode. That said UTF-8 is probably a way better choice than UTF-16/UCS-2

UTF-8 is a relatively new thing in MSSQL and had lots of issues initially, I agree it's better and should have been implemented in the product long ago.

I have avoided it and have not followed if the issues are fully resolved, I would hope they are.


> UTF-8 is a relatively new thing in MSSQL and had lots of issues initially, I agree it's better and should have been implemented in the product long ago.

Their insistence on making the rest of the world go along with their obsolete pet scheme would be annoying if I ever had to use their stuff for anything ever. UTF-8 was conceived in 1992, and here we are in 2026 with a reasonably popular database still considering it the new thing.


I would be more critical of Microsoft choosing to support UCS-2/UTF-16 if Microsoft hadn't completed their implementation of Unicode support in the 90s and then been pretty consistent with it.

Meanwhile Linux had a years long blowout in the early 2000s over switching to UTF-8 from Latin-1. And you can still encounter Linux programs that choke on UTF-8 text files or multi-byte characters 30 years later (`tr` being the one I can think of offhand). AFAIK, a shebang is still incompatible with a UTF-8 byte order mark. Yes, the UTF-8 BOM is both optional and unnecessary, but it's also explicitly allowed by the spec.
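
The shebang/BOM clash is easy to see at the byte level. A short sketch follows; the script text is arbitrary, and the check simply mirrors the fact that the `#!` has to be the very first two bytes of the file.

```python
import codecs

script = "#!/bin/sh\necho hello\n"

plain = script.encode("utf-8")         # no BOM
with_bom = script.encode("utf-8-sig")  # same text, prefixed with the UTF-8 BOM

print(plain[:5])
print(with_bom[:5])

print("plain starts with #!:   ", plain.startswith(b"#!"))
print("with BOM starts with #!:", with_bom.startswith(b"#!"))
```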


It's not really a Linux vs MS thing though. When Unicode first came out, it was 16-bit, so all the early adopters went with that. That includes Java, Windows, JavaScript, the ICU libraries, LibreOffice and its predecessors, .NET, the C language (remember wchar_t?), and probably a few more.

UTF-8 turned out to be the better approach, and it's slowly taking over, but it was not only Linux/Unix that pushed it ahead; the entire networking world did, especially HTTP. Props also to early Perl for jumping straight to UTF-8.

Still... UTF-8's superiority was clear enough by 2005 or so. MS could and should have seen it by then instead of waiting until 2019 to add UTF-8 collations to its database. Funny to see SQL Server falling behind good old MySQL on such a basic feature.


Database systems are inherently conservative -- once you add something you have to support it forever. Microsoft went hog wild on XML in the database and I haven't seen it used in over a decade now.

In 92 it was a conference talk. In 98 it was adopted by the IETF. Point probably stands though.

the data types were introduced with SQL Server 7 (1998) so i’m not sure it’s accurate to state that it’s considered as the new thing.


thanks. now i see the point that the poster was making.

The one place UTF-16 massively wins is text that would be two bytes as UTF-16, but three bytes as UTF-8. That's mainly Chinese, Japanese, Korean, etc...
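
A quick way to see that trade-off, with sample strings chosen arbitrarily. `utf-16-le` is used so the byte count does not include a BOM.

```python
samples = {
    "English": "hello",
    "Japanese": "こんにちは",
    "Chinese": "你好世界",
}

sizes = {}
for name, text in samples.items():
    utf8 = len(text.encode("utf-8"))
    utf16 = len(text.encode("utf-16-le"))
    sizes[name] = (utf8, utf16)
    print(f"{name:9} chars={len(text)} utf8={utf8} utf16={utf16}")
```

ASCII text doubles in UTF-16, while these CJK samples take two bytes per character instead of three.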

I love this design language to death. I know a lot of engineers prefer a no-frills, straight to the point readme (as reflected in these comments), and I get that. But I also don't want to live in a world made out of nothing but boxes.

It feels a bit like visiting Fallingwater and complaining that there are no arrows pointing to the bathroom.


Not to worry, ChatGPT will be happy to oblige and make every other repo look as unique and special as this one.

I love regex101.com, so really happy to see it breaks the mold here.

There is a self-hosted live sync plugin. It's rough around the edges but it mostly works and is actively maintained, if you are willing to self-host a sync server.

I say mostly works, because there are a lot of "gotchas" and the configuration and set up are a bit intimidating for the clients (the server is simple to host).

I used it for a while and it was fine, but I decided the cost of a coffee per month is worth not having to maintain it, and I switched to paying for their sync service.

However, there is also a git sync plugin that works really nicely. But it is not a real-time sync and it is not supported on mobile (officially). I mainly use that as a way to keep long running backups of my vaults in a self-hosted gitea instance (the default paid tier only keeps one month of history).


- Spam a product/service

- Build up account age so that spamming a product/service is easier and the account appears more trustworthy

- Influence discussions in a particular direction for monetary gain, i.e. "I got rich on bitcoin, you'd be crazy not to invest".

- Influence discussions in a particular direction for political gain, i.e. "I went to Xinjiang and the Uyghurs couldn't be happier!"

