MIME-Version: 1.0 Content-Type: multipart/related; boundary="----=_NextPart_01DB1029.CAD76960" This document is a Single File Web Page, also known as a Web Archive file. If you are seeing this message, your browser or editor doesn't support Web Archive files. Please download a browser that supports Web Archive. ------=_NextPart_01DB1029.CAD76960 Content-Location: file:///C:/EB2CB227/SteveOonSuperhumanTheoremProving.htm Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset="windows-1252"

Progress on Superhuman Theorem Proving?

Steve Omohundro, Founding Editor

Harmonic (which is working on "Mathematical Superintelligence") just closed their $75 million Series A round led by Sequoia Capital https://harmonic.fun/news The New York Times did a piece ab= out them and about the use of mathematical proof as a "path around hallucinations" as DeepMind's David Silver put it. He says "Proof is a form of truth". "Is Math the Path to Chatbots Th= at Don’t Make Stuff Up? Chatbots like ChatGPT get stuff wrong. But researchers= are building new A.I. systems that can verify their own math — and maybe more."

https://archive.ph/O7qaG

Here’s the full interview with the Harmonic guys: &quo= t;Why Vlad Tenev and Tudor Achim of Harmonic Think AI Is About to Change Math—and= Why It Matters"

https://www.youtube.com/watch?v=3DNvAxuCIBb-c&ab_c= hannel=3DSequoiaCapital

They hope that an AI-human collaboration might solve a Millennium Prize problem by 2028 and the Riemann hypothesis by 2029.

Vlad Tenev says "If you define it as something th= at can reason and solve math problems in excess of any human, something that would give Terry Tao a run for his money, I think we're a couple of years away. B= ut I think the models within the next year will get to the 99.99th percentile.&q= uot; If he is correct, I think this is a complete game changer, upending all of the basic assumptions on both the capabilities = and the safety sides of AI.

I've been discussing with the provably safe crew about whether superhuman theorem proving *next year* will advance provable safety= or cyberattack capabilities more rapidly. Here are some of my thoughts on the tradeoffs:

Typical current methods for attacking systems use tech= niques like "fuzzing": https://en.wikipedia.org/wiki/Fuzzing  to simulate a bunch of random attacks a= nd see if anything succeeds. The distribution of generated attacks can be selected using knowledge of the system and can be adapted as the search proceeds. But with a good theorem prover, the attacker can search over *subsets* of the attack space rather than just particular attacks. When a subset is proven s= afe, it can be removed from the space of candidate attacks. Branch and bound can then home in on the possible candidates and dramatically speed up the search for successful attacks, I think.

On the face of it, the defense task seems much harder:= now we have to search over both the space of possible designs and the space of attacks to find one for which we can prove that no attack can succeed. A priori, that double search seems much more challenging and expensive. But I think there are two things that improve things enormou= sly:

1) We can perform the two searches simultaneously and = use theorem proving to generalize discoveries. A successful attack doesn't just rule out one design, it eliminates a whole set of potential designs. And a proof that there are no attacks also applies to a whole set of designs.

2) We can build designs out of reusable components for= which we have already proven safety properties. If we have also proven the safety impacts of certain forms of composition, then many designs will just involve combining pre-proven components selected from a library and combining them = with pre-proven methods.

And I think we are also likely to create specification= s in that way: from component specs with proven properties and proven compositio= nal methods (eg. often just conjunction: "I wa= nt no memory leaks *and* no buffer overflows").

Similarly, models of attacker capabilities also seem naturally compositional.

I've been delving more into the Lean ecosystem and I'm becoming more and more impressed with it. It really seems pretty natural for the tasks that are needed for forma= lizaton and proofs of safety. And they've even done amazing things with the program= ming language implementation of things like memory management. There are tons of training materials which are generally well-written. But I haven't found practically oriented materials aimed at helping people formalize and prove properties of real systems. This book is somewhat practically oriented but still not quite what is needed, I think: The Hitchhiker's Guide to Logical Verification, 2024 LMU Desktop Edition, March 28, 2024

https://github.com/blanchette/interactive_theorem_prov= ing_2024/blob/main/hitchhikers_guide_2024_lmu_desktop.pdf

------=_NextPart_01DB1029.CAD76960 Content-Location: file:///C:/EB2CB227/SteveOonSuperhumanTheoremProving_files/themedata.thmx Content-Transfer-Encoding: base64 Content-Type: application/vnd.ms-officetheme UEsDBBQABgAIAAAAIQDp3g+//wAAABwCAAATAAAAW0NvbnRlbnRfVHlwZXNdLnhtbKyRy07DMBBF 90j8g+UtSpyyQAgl6YLHjseifMDImSQWydiyp1X790zSVEKoIBZsLNkz954743K9Hwe1w5icp0qv 8kIrJOsbR12l3zdP2a1WiYEaGDxhpQ+Y9Lq+vCg3h4BJiZpSpXvmcGdMsj2OkHIfkKTS+jgCyzV2 JoD9gA7NdVHcGOuJkTjjyUPX5QO2sB1YPe7l+Zgk4pC0uj82TqxKQwiDs8CS1Oyo+UbJFkIuyrkn 9S6kK4mhzVnCVPkZsOheZTXRNajeIPILjBLDsAyJX89nIBkt5r87nons29ZZbLzdjrKOfDZezE7B /xRg9T/oE9PMf1t/AgAA//8DAFBLAwQUAAYACAAAACEApdan58AAAAA2AQAACwAAAF9yZWxzLy5y ZWxzhI/PasMwDIfvhb2D0X1R0sMYJXYvpZBDL6N9AOEof2giG9sb69tPxwYKuwiEpO/3qT3+rov5 4ZTnIBaaqgbD4kM/y2jhdj2/f4LJhaSnJQhbeHCGo3vbtV+8UNGjPM0xG6VItjCVEg+I2U+8Uq5C ZNHJENJKRds0YiR/p5FxX9cfmJ4Z4DZM0/UWUtc3YK6PqMn/s8MwzJ5PwX+vLOVFBG43lExp5GKh qC/jU72QqGWq1B7Qtbj51v0BAAD//wMAUEsDBBQABgAIAAAAIQBreZYWgwAAAIoAAAAcAAAAdGhl bWUvdGhlbWUvdGhlbWVNYW5hZ2VyLnhtbAzMTQrDIBBA4X2hd5DZN2O7KEVissuuu/YAQ5waQceg 0p/b1+XjgzfO3xTVm0sNWSycBw2KZc0uiLfwfCynG6jaSBzFLGzhxxXm6XgYybSNE99JyHNRfSPV kIWttd0g1rUr1SHvLN1euSRqPYtHV+jT9yniResrJgoCOP0BAAD//wMAUEsDBBQABgAIAAAAIQBn gPy0jgcAAM0gAAAWAAAAdGhlbWUvdGhlbWUvdGhlbWUxLnhtbOxZS48buRG+B8h/aPRdVrceLWlg eaGnZ+0Z27BkB3vkSJSaHnazQVIzFhYGAu8plwABdhe5BMgthyDIAlkgi1zyYwzYSDY/IkV2q0VK lOcBAzGCmbmo2V8VP1YVq6rJ+1+8Tqh3gbkgLO364b3A93A6Y3OSLrv+i+m40vY9IVE6R5SluOuv sfC/ePDLX9xHRzLGCfZAPhVHqOvHUmZH1aqYwTAS91iGU3i3YDxBEh75sjrn6BL0JrRaC4KomiCS +l6KElD7dLEgM+xNlUr/wUb5iMJjKoUamFE+UaqxJaGx8/NQIcRaDCj3LhDt+jDPnF1O8WvpexQJ CS+6fqD//OqD+1V0VAhReUDWkBvrv0KuEJif1/ScfHlWThqMau1GWOrXACr3caO2+i/1aQCazWCl ORdTZ9iMgnatwBqg/KdDd6cV1m28ob++xznsRP1aw9KvQbn+xh4+GHdGw6aF16Ac39zD94Jav1O3 8BqU46M9fGPUa9VGFl6DYkrS83101Gq3owJdQhaMHjvhnSgKWsMCvkVBNJTRpaZYsFQeirUEvWJ8 DAAFpEiS1JPrDC/QDKK4l0kmvCERGUVr38tQygQMB7UwhNBrBLXyX1scHWFkSCtewETsDSk+nphx ksmu/wi0+gbk/U8/vXv747u3f3/3zTfv3v7VOyHLWOaqLLljlC5NuZ//9Lv//OHX3r//9sefv/3O jRcm/sNffvPhH//8mHrYaltTvP/+hw8//vD+97/915+/dWjvcXRmwqckwcJ7gi+95yyBBWpT2Pzx Gb+ZxDRGxJTopUuBUqRmcegfydhCP1kjihy4Prbt+JJDqnEBH65eWYQnMV9J4tD4OE4s4CljtM+4 0wqP1VyGmaerdOmenK9M3HOELlxzD1BqeXm0yiDHEpfKQYwtms8oSiVa4hRLT71j5xg7VvcVIZZd T8mMM8EW0vuKeH1EnCaZkjMrmrZCxyQBv6xdBMHflm1OX3p9Rl2rHuILGwl7A1EH+SmmlhkfopVE iUvlFCXUNPgJkrGL5GTNZyZuJCR4eokp80ZzLIRL5imH9RpOf4wguzndfkrXiY3kkpy7dJ4gxkzk kJ0PYpRkLuyEpLGJ/VKcQ4gi7xmTLvgps3eIegY/oPSgu18SbLn76mzwArKcSWkbIOrNijt8+RAz K34na7pA2JVqejyxUmyPE2d09FdLK7RPMKboEs0x9l586WDQZ5ll8y3pRzFklWPsCqxHyI5V9Zxi Ab2Sam728+QJEVbITvCSHeBzut5JPGuUJogf0vwEvG7afHTGYTM6KDyls3MT+IRADwjx4jTKUwE6 jOA+qPVZjKwCpp6FO17X3PLfdfYY7MtXFo1r7EuQwTeWgcRuynzUNlNErQm2ATNFxDtxpVsQsdy/ FVHFVYutnHILe9Nu3QDdkdX0JCS9ogP633Q+jkD8ND2PW7GVsG7Y7RxKKMc7Pc4h3G5nM2B8Tj7/ xmaIVukzDLVkP2vd9TV3fY3/f9/XHNrPd93MoZ7jrpvxocu462aKA5ZP081sGxjobdQhQ37Yo49+ koMnPwtC6USuKT4R+vBHwDfNfAyDSk6feuLyJDCL4acqczCBhVtypGU8zuSviIwnMcrghCj0lZKl KFQvhZcxAQdHetipW+HpKjll8/zAU58wBXllFUhux4MmHD3l43BYJXN01CoGFT99qgp8NdulPmzd EFCyNyFhTGaTqDtItDaDV5BQZ2efhkXHwaKt1G9ctWcKoFZ6BT66PfhU7/rNhiIEJ+ViBg36XPkp d/XGu9qZn9LTh4xpRQAcLuYrgaP50tMdxfXg8tTq8lC7hqctEtopeVjZJLRldIMnYvgULqJTjV6H xk193dm61KKnTKHng/je0mi1P8bitr4Gud3cQFMzU9DUu+z6Ub0JITNDWddfwMEx/EwyiB2hvrsQ XcLty0zyfMPfJrNkXMghEnFucJ10cvckRGLuUZJ0fbX80g001TlEcwtrkBA+W3IdSCufGzlwuu1k vFjgmTTdbowoS+ePkOHzXOF8q8VvD1aSbAXunsTzS++MrvhzBCHWbIXKgHMi4P4gzK05J3AhViay bfztFKYi+Zs3UjqG8nFEsxgVFcVM5jlc15OSjn4qbWA8FWsGgxomKQrh2VIVWNOoVjUtS1fO4WDV vVpIWc5ImtuaaWUVVTXdWcyaYVMGdmx5uyJvsNqYGHKaWeHz1L2bcjubXLfTJ5RVAgxe2u92pd+g tp3MoqYY76dhlbOLUbt2bBZ4BbXrFAkj60cbtTt2K2uEczoYvFXlB7ndqIWhxaav1JbWN+fm5TY7 ewXJYwhd7opKoV0Jp7scQVc20T1JnjZgi7yWxdaAX96Kk67/ddDsNQa15qAStJujSqPeCCrtZq9e 6TWb9XDUDINhv/YGCouMk7CZ39qP4RKDrou7ez2+d3+fbO5p7s1YUmX6fr6qiev7+7B2+P7eI5B0 vo5q4069048qnXpvXGkM++1KZxD1K8No0BqOh4NmuzN+43sXGtzo1QeNaNSuROFgUGlEgaLf7lRa jVqt12j12qNG703RxsDK8/RR2ALMq3k9+C8AAAD//wMAUEsDBBQABgAIAAAAIQAN0ZCftgAAABsB AAAnAAAAdGhlbWUvdGhlbWUvX3JlbHMvdGhlbWVNYW5hZ2VyLnhtbC5yZWxzhI9NCsIwFIT3gncI b2/TuhCRJt2I0K3UA4TkNQ02PyRR7O0NriwILodhvplpu5edyRNjMt4xaKoaCDrplXGawW247I5A UhZOidk7ZLBggo5vN+0VZ5FLKE0mJFIoLjGYcg4nSpOc0IpU+YCuOKOPVuQio6ZByLvQSPd1faDx mwF8xSS9YhB71QAZllCa/7P9OBqJZy8fFl3+UUFz2YUFKKLGzOAjm6pMBMpburrE3wAAAP//AwBQ SwECLQAUAAYACAAAACEA6d4Pv/8AAAAcAgAAEwAAAAAAAAAAAAAAAAAAAAAAW0NvbnRlbnRfVHlw ZXNdLnhtbFBLAQItABQABgAIAAAAIQCl1qfnwAAAADYBAAALAAAAAAAAAAAAAAAAADABAABfcmVs cy8ucmVsc1BLAQItABQABgAIAAAAIQBreZYWgwAAAIoAAAAcAAAAAAAAAAAAAAAAABkCAAB0aGVt ZS90aGVtZS90aGVtZU1hbmFnZXIueG1sUEsBAi0AFAAGAAgAAAAhAGeA/LSOBwAAzSAAABYAAAAA AAAAAAAAAAAA1gIAAHRoZW1lL3RoZW1lL3RoZW1lMS54bWxQSwECLQAUAAYACAAAACEADdGQn7YA AAAbAQAAJwAAAAAAAAAAAAAAAACYCgAAdGhlbWUvdGhlbWUvX3JlbHMvdGhlbWVNYW5hZ2VyLnht bC5yZWxzUEsFBgAAAAAFAAUAXQEAAJMLAAAAAA== ------=_NextPart_01DB1029.CAD76960 Content-Location: file:///C:/EB2CB227/SteveOonSuperhumanTheoremProving_files/colorschememapping.xml Content-Transfer-Encoding: quoted-printable Content-Type: text/xml ------=_NextPart_01DB1029.CAD76960 Content-Location: file:///C:/EB2CB227/SteveOonSuperhumanTheoremProving_files/filelist.xml Content-Transfer-Encoding: quoted-printable Content-Type: text/xml; charset="utf-8" ------=_NextPart_01DB1029.CAD76960--