F*: A proof oriented general purpose programming language

254 points by akkad33 6 months ago

I studied formal languages for ~2 years and have professional experience programming coq. The real benefit of this language, over other formal languages is the focus on being able to write real programs in it. Most theorem proving languages are focused on mathematics or proving things about a program, and they are very abstract. This language appears to have a goal of bridging the gap and making it simple to write programs and prove parts of them. I believe this is the future of formal languages. If you write something in Rust, it would be great to prove aspects about it. Why not? Well F* is a vision if this future. As proof automation gets better, we will be able to prove many things about our programs, something which is not typically attempted due to difficulty. For instance, imagine proving that a certain path has no allocations, spawns no threads, cannot deadlock, preserves privacy, only modifies a certain subset of global data, is a specific algorithmic complexity, is not subject to timing attacks, maintains the conference of a cache, is memory safe (without a borrow checker). The limit is your imagination and Rice's theorem.

akkad33 6 months ago

https://dafny.org/ also allows proof checking and allows do write real programs with it. It has a java like syntax and is also from MS I believe
- nextos 6 months ago
  
  Having used both Dafny and F* quite extensively, Dafny, and its predecessor Spec#, are simple and practical, thanks to being based on Hoare logic (contracts).
  It's a great place to start with verification, as proofs are discharged to SAT/SMT, so things are automated. It can get a bit frustrating when automation is not able to prove things, that's the major disadvantage of SAT/SMT.
  But it's not a toy. Some of the largest verification efforts have been done in Dafny. See e.g. IronFleet [1].
  [1] https://www.andrew.cmu.edu/user/bparno/papers/ironfleet.pdf
  - fovc 6 months ago
    
    Agree it’s not a toy. AWS implemented a large chunk of IAM in Dafny. Though IIRC they have their own non-public compiler to Java
  - gsbcbdjfncnjd 6 months ago
    
    F* also uses SAT/SMT, specifically Z3.
    
    nextos 6 months ago
    
    Sure, what I meant is that Dafny outsources everything to SAT/SMT. If it doesn't manage to prove things for you, it gets a bit frustrating as there is not much support for manual tactics compared to F*, Coq or Isabelle, which can also leverage SAT/SMT, but have other options.
nickpsecurity 6 months ago

I’ll add that SPARK Ada allows this now, has commercial tooling from Adacore, and was field-prove in years of commercial deployments.
https://en.m.wikipedia.org/wiki/SPARK_(programming_language)
It builds on the Why3 platform which also supports Frama-C for the C language. IIRC, SPARK can be compiled to C for C-only, runtime targets as well.
markusde 6 months ago

Do you think you might be able to elaborate a little bit more about this?
I was skimming the "Proof-Oriented Programming" book, and it seems that the primary way to execute F* programs is by extraction or trusted compilation (same as Rocq and Lean, for example). Does F* have some special way to work directly with realistic source code that these other systems don't?
- seeknotfind 6 months ago
  
  F* is a programming language with proof support. Lean/Coq are theorem providers that can be used to model and generate code. In Coq, there's not a standard way to generate code, you might model and create code many different ways. F* seems to bring this in as the central goal of the language. So I'm discussing this structural difference. F* has an SMT solver, but Lean can import an SMT solver. So the goal is the important difference here - making theorem proving as accessible as possible in a real language. It's a move in the right direction, but we want to get to the point standard program languages have this support if you need it.
  - oisdk 6 months ago
    
    > F* is a programming language with proof support. Lean/Coq are theorem providers that can be used to model and generate code.
    Lean is also a programming language with proof support. It is very much in the same category as F* in this regard, and not in the same category as Coq (Rocq).
    
    clarus 6 months ago
    
    In Rocq/Coq, you have "extraction", which is the standard way to compile programs. This is how the C compiler CompCert is executed, for example. So, all of these languages are in the same category in this respect.
guerrilla 6 months ago

> I studied formal languages for ~2 years and have professional experience programming coq. The real benefit of this language, over other formal languages is the focus on being able to write real programs in it.
How does Idris compare these days?
rajamaka 6 months ago

Where would a smooth brained CRUD app web developer who failed high school maths perhaps learn more about the point of this?
- seeknotfind 6 months ago
  
  The point would be to prove that your database stays consistent, a DB upgrade wouldn't break, or that you could achieve 100% up time given certain assumptions. An automated solver/prover might even be able to tell you where in your program breaks this. Without a technology like this, you have to carefully read your program, but this is the methodology to make the computer deal with that. Unfortunately formal verification is mostly used for research, F* is framing itself as taking a step to bring that to production codebases. Unfortunately I don't know good resources, but as researchers adapt this technology, and as we automate a lot of the heavy mathematics, you'd never want a programming language without this option.
newpavlov 6 months ago

>memory safe (without a borrow checker)
This sounds like a very weird statement.
In my opinion, a proper proof-oriented language would inevitably contain a borrow checker variant as a subset of its functionality (assuming the language has notion of pointers/references and focuses on achieving "zero cost abstractions"). Lifetime annotations and shared/exclusive references provide essential information and code use restrictions which enable automatic proof generation.
After all, type and borrow checkers are nothing more than limited forms of automatic proof generation which are practical enough for wide use.

thunkingdeep 6 months ago

In what situations would one prefer this vs Lean?

This seems to compile to native code if desired, so does that mean it’s faster than Lean?

Forgive me if these are obvious questions, I’m just curious and on my phone right now away from my machine.

nextos 6 months ago

Lean has very little support for proving things about software. Right now, the community is mainly geared towards mathematics. This could change. Concrete Semantics was rewritten in Lean [1], but I haven't seen more efforts geared towards software in the Lean community.
Dafny, Isabelle, Why3, Coq and F* have been used to verify non-trivial software artifacts. Liquid Haskell, Agda and others are also interesting, but less mature.
[1] https://browncs1951x.github.io/static/files/hitchhikersguide...
- TypingOutBugs 6 months ago
  
  Agda and Liquid Haskell are used for Cardano, one of the largest blockchain platforms, alongside other tooling in that space. It’s one of the larger formally verified projects in the wild so I’d argue it’s fairly mature.
  For example their formal specification of their ledger system:
  https://drops.dagstuhl.de/storage/01oasics/oasics-vol118-fmb...
markusde 6 months ago

One technical difference is that F* heavily uses SMT automation, which is less emphasized in Lean (their book even says that F* typechecking is undecidable). F* programmers frequently talk about the language's emphasis on monadic programming, which I'll admit that I don't understand (but would like to!)
- ijustlovemath 6 months ago
  
  As long as you understand that a monad is a monad, you should be fine!
cess11 6 months ago

Looking at publications on the home pages for both projects it seems F* results in more practical stuff like hardened cryptography and memory allocation libraries, while Lean presents itself as more of an experiment in developing proof assistants.
jey 6 months ago

> F* is oriented toward verified, effectful functional programming for real-world software, while Lean is geared toward interactive theorem proving and formalizing mathematical theories, though both support dependent types and can serve as general-purpose functional languages in principle.
pjmlp 6 months ago

Lean also started at MSR, and nowadays is bootstraped, they use different approaches.

IshKebab 6 months ago

I tried to learn F* but honestly it was quite confusing with all the different sub-languages and variants.

agnishom 6 months ago

I think Lean is another closely related language. It is very famous for its theorem proving capabilities, of course, but it seems that it can also express effectful computations (aka, real world programs).

Somebody managed to solve Advent of Code with Lean: https://github.com/anurudhp/aoc2022

Somebody managed to write a raytracer with Lean: https://github.com/kmill/lean4-raytracer

BoingBoomTschak 6 months ago

Never used it, but https://github.com/project-everest/mitls-fstar always seems an incredibly cool thing. Huh, apparently there's a successor to F* called F7 (https://github.com/mitls/mitls-flex) ?

mlinksva 6 months ago

Looks like it's a predecessor https://www.microsoft.com/en-us/research/project/f7-refineme...
- BoingBoomTschak 6 months ago
  
  Oh yeah, didn't look at the dates. It's maybe dead, then.
Taikonerd 6 months ago

Is Project Everest still going? The GitHub link says that the repo has been archived...

nottorp 6 months ago

I wonder if it's a good idea to pick a name that's hard to search for...

wk_end 6 months ago

Is it really that much harder to search for than "C", "C++", "C#", "F#", "Go", "Rust", "D"...?
Googling "Fstar programming language" works fine for me.
- nottorp 6 months ago
  
  Didn't say those are more inspired. At least Rust is long enough to not get ignored by the search engines. And some of the older ones have being older than the internet as an excuse.
- amelius 6 months ago
  
  ".NET"
xedrac 6 months ago

For a language that touts its practicalities, the name isn't a great start. Although F*lang or F***lang (maybe Foq?) seem like reasonable search proxies.
- markusde 6 months ago
  
  Foq is hilarious, especially given that just today Coq released its website with its new name (Rocq)
fosefx 6 months ago

Just like with "golang", "FStar" is the query if choice. But don't think you'll find much, if the documentation is in the same state as it was two years ago.
amelius 6 months ago

...and that looks like someone is swearing.

aranchelk 6 months ago

Anyone coming from a Haskell background, there’s some interesting (though possibly dated) info here:

https://www.reddit.com/r/haskell/comments/76k1x0/what_do_you...

kittikitti 6 months ago

This is great and could provide additional guardrails on top of large language models. I imagine we could provide a more rigorous approach to prove that a LLM can behave in certain ways within a workflow.

ijustlovemath 6 months ago

LLMs are fundamentally probabilistic in nature; I think you could prove things about distributions of outputs but nothing like the kind of formal verification you can create with Lean or Rust.
TypingOutBugs 6 months ago

There’s a few papers in this space but it’s still early days, both using Formal Verification to test outputs of LLMs and using LLMs in formal verification itself
Examples of both:
https://sumitkumarjha.com/papers/2023_ICAA_LLM_Dehallucinati...
https://mathai2023.github.io/papers/28.pdf
xigoi 6 months ago

An LLM is pretty much a black box. You can’t prove anything meaningful about its output.
IceDane 6 months ago

You are really comparing apples to oranges here, and to say that trying to apply formal verification to a probabilistic lying machine is silly would be a colossal understatement.
palata 6 months ago

Isn't it a fundamental limitation of LLMs that we can't?
- rscho 6 months ago
  
  Obviously not. It's right in the name, isn't it? LLM = limitless model...

vb-8448 6 months ago

I wonder if the proliferation of programming languages we saw in the last decade is due to the fact that nowadays it's extremely easy to create one or that existing ones sucks?

mpweiher 6 months ago

I think it's because (a) it's become a lot easier to create languages and (b) we're stuck.
I am hoping (a) is straightforward. For (b), I think most of us sense at least intuitively, with different strengths, that our current crop of programming languages are not really good enough. Too low-level, too difficult to get to a more compact and meaningful representation.
LLMs have kinda demonstrated the source for this feeling of unease, by generating quite a bit of code from fairly short natural language instructions. The information density of that generated code can't really be all that high, can it now?
So we have this issue with expressiveness, but we lack the insight required to make substantial progress. So we bounce around along the well-trodden paths of language design, hoping for progress that refuses to materialize.
- TypingOutBugs 6 months ago
  
  F* is a research project thats almost 15 years old! Its not part of the recent wave of languages
  - mpweiher 6 months ago
    
    Cool!
    Though I'd say we've been stuck since about the 80s.
gwervc 6 months ago

F* is older than a decade. Also it's a research project, so quite different than the random "memory safe" language iteration of the day.
shakna 6 months ago

I think it's closer to an availability thing. A lot of businesses used to run on privately built programming languages. And everyone had their own BASIC implementation that wasn't quite compatible with everyone else. But transferring the private language across to someone on the other side of the world was difficult.
It did happen over early things like IRC and the like, but the audience was limited, and the data was ephemeral.
For example, Byte Magazine has an article on writing your own assembler in 1977 [0]: "By doing it yourself you can learn a lot about programming and software design as well as the specs of your own microcomputer, save yourself the cost of program development, and produce a customized language suited to your own needs or fancies."
Programmers have been creating their own languages since the beginning. It's part of why we lean so heavily towards standards - so other people don't break the things we've already built.
[0] https://archive.org/details/best_of_byte_volume_1_1977_06
hnlmorg 6 months ago

Neither. It’s always been easy to create a language. What’s been hard is promoting that language.
The things that’s changed is that these days you have a combination of more people using computers plus more convenient forums (like this) for people to promote their new language.
int_19h 6 months ago

Is there really a proliferation of PLs that hasn't been the case before? I recall many new languages back in 00s as well, it's just that most of them didn't live long enough for people to remember them now.

madihaa 6 months ago

(old discussion) https://news.ycombinator.com/item?id=40377685

spartakos87 6 months ago

Where can someone be started with this kind of programming? System verification etc, books? youtube lectures?

tdiff 6 months ago

I believe future of programming would vaguely be some mix of intelligent code generation (e.g. LLMs) + formal verification. "LLMs" would be the compiler of the future, something like gcc today, converting "human" language into machine code.

shakna 6 months ago

For that, we'd have to have deterministic outputs from the code generator. Which is exactly what the current prospective AI fields don't do. And what the current LLMs cannot.
GoblinSlayer 6 months ago

I wonder why AI doesn't do this verification already. It's a sufficiently simple routine task that doesn't require great precision.
- IceDane 6 months ago
  
  Formal verification of arbitrary code is a "simple, routine task"?
  - GoblinSlayer 6 months ago
    
    Informal verification.

kodemongos 6 months ago

[dead]

deknos 6 months ago

Nice, but as long as at least core and minimal standard library is not really opensource, well maintained and first class citizen on linux, it will not get real distribution.

Even C# suffers from that, though it is a nice language.

For a programming language to have success it needs to be

* opensource * well maintained * first class citizen on win/lin/mac