Frederik Schmitt

About

I am a Ph.D. student in the Reactive Systems Group at CISPA Helmholtz Center for Information Security and at the Computer Science Department of Saarland University.

My research targets the integration of neural and symbolic reasoning, with applications in automated verification. Key results include:

  • end-to-end generation of proof certificates when learning formal logics.
  • first deep learning solution for synthesizing circuits from temporal logic.
  • first interactive formalization of natural language to temporal specifications with LLMs.
Contact

+49 681 87083 2462
CISPA, Saarbrücken
E9 1 / 1.09
Publications
    Learning Better Representations from Less Data for Propositional Satisfiability
    Mohamed Ghanem, Frederik Schmitt, Julian Siber and Bernd Finkbeiner · NeurIPS 2024 · Spotlight Paper

    Training neural networks on NP-complete problems typically demands very large amounts of training data and often needs to be coupled with computationally expensive symbolic verifiers to ensure output correctness. In this paper, we present NeuRes, a neuro-symbolic approach that addresses both challenges for propositional satisfiability, the quintessential NP-complete problem. By combining certificate-driven training and expert iteration, our model learns better representations than models trained for classification only, with much higher data efficiency, requiring orders of magnitude less training data. NeuRes employs propositional resolution as a proof system to generate proofs of unsatisfiability and to accelerate the process of finding satisfying truth assignments, exploring both possibilities in parallel. To realize this, we propose an attention-based architecture that autoregressively selects pairs of clauses from a dynamic formula embedding to derive new clauses. Furthermore, we employ expert iteration, whereby model-generated proofs progressively replace longer teacher proofs as the new ground truth. This enables our model to reduce a dataset of proofs generated by an advanced solver by around 32% after training on it with no extra guidance. Owing to this self-improving workflow, NeuRes is not limited by the optimality of the teacher algorithm. Our model achieves far better performance than NeuroSAT in terms of both correctly classified and proven instances.
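    The resolution proof system at the heart of NeuRes can be illustrated with a minimal sketch, representing clauses as frozensets of signed integer literals in the DIMACS convention. The code is purely illustrative and not the NeuRes implementation:

```python
def resolve(c1, c2, var):
    """Resolution rule: from a clause containing var and a clause
    containing -var, derive (c1 \\ {var}) | (c2 \\ {-var})."""
    assert var in c1 and -var in c2
    return (c1 - {var}) | (c2 - {-var})

# Clauses (x or y), (not x or y), (not y); variables x=1, y=2.
c1, c2, c3 = frozenset({1, 2}), frozenset({-1, 2}), frozenset({-2})

r1 = resolve(c1, c2, 1)   # resolve on x, yielding {y}
r2 = resolve(r1, c3, 2)   # resolve on y, yielding the empty clause

print(r1, r2)  # frozenset({2}) frozenset()
```

    Deriving the empty clause certifies unsatisfiability; the learned part of NeuRes is the selection of which clause pairs to resolve next.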

    NeuroSynt: A Neuro-symbolic Portfolio Solver for Reactive Synthesis
    Matthias Cosler, Christopher Hahn, Ayham Omar and Frederik Schmitt · TACAS 2024

    We introduce NeuroSynt, a neuro-symbolic portfolio solver framework for reactive synthesis. At the core of the solver lies a seamless integration of neural and symbolic approaches to solving the reactive synthesis problem. To ensure soundness, the neural engine is coupled with model checkers verifying the predictions of the underlying neural models. The open-source implementation of NeuroSynt provides an integration framework for reactive synthesis in which new neural and state-of-the-art symbolic approaches can be seamlessly integrated. Extensive experiments demonstrate its efficacy in handling challenging specifications and enhancing state-of-the-art reactive synthesis solvers, with NeuroSynt contributing novel solves on the current SYNTCOMP benchmarks.
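    The portfolio principle can be sketched in a few lines (a hypothetical interface, not NeuroSynt's actual API): a fast but unsound neural engine proposes a candidate, a verifier checks it, and a sound-and-complete symbolic solver serves as fallback, so the combined procedure stays sound:

```python
def portfolio_solve(spec, neural_engine, verifier, symbolic_solver):
    """Return a solution and which engine produced it."""
    candidate = neural_engine(spec)
    if candidate is not None and verifier(spec, candidate):
        return candidate, "neural"      # verified prediction: sound by construction
    return symbolic_solver(spec), "symbolic"  # complete fallback

# Toy instantiation: the "spec" asks for a divisor of n greater than 1.
neural = lambda n: 2 if n % 2 == 0 else None         # guesses, may give up
verify = lambda n, d: d > 1 and n % d == 0           # cheap certificate check
exact  = lambda n: next(d for d in range(2, n + 1) if n % d == 0)

print(portfolio_solve(12, neural, verify, exact))  # (2, 'neural')
print(portfolio_solve(15, neural, verify, exact))  # (3, 'symbolic')
```

    In NeuroSynt the candidate is a circuit, the verifier is a model checker, and the engines run as a portfolio rather than strictly in sequence; the sketch only shows the soundness argument.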

    Neural Circuit Synthesis with Pre-trained Language Models
    Frederik Schmitt, Matthias Cosler and Bernd Finkbeiner · DAV 2023

    This extended abstract reports preliminary results on fine-tuning pre-trained language models for solving reactive synthesis problems end-to-end. In recent work, hierarchical Transformer neural networks have been successfully trained from scratch to synthesize sequential circuits directly out of formal specifications. We improve over existing approaches by fine-tuning CodeT5 models that have been pre-trained on both natural language and programming languages. Our experiments show improved generalization and sample efficiency compared to the previous approach.

    nl2spec: Interactively Translating Unstructured Natural Language to Temporal Logics with Large Language Models
    Matthias Cosler, Christopher Hahn, Daniel Mendoza, Frederik Schmitt and Caroline Trippel · CAV 2023

    A rigorous formalization of desired system requirements is indispensable when performing any verification task. This often limits the application of verification techniques, as writing formal specifications is an error-prone and time-consuming manual task. To facilitate this, we present nl2spec, a framework for applying Large Language Models (LLMs) to derive formal specifications (in temporal logics) from unstructured natural language. In particular, we introduce a new methodology to detect and resolve the inherent ambiguity of system requirements in natural language: we utilize LLMs to map subformulas of the formalization back to the corresponding natural language fragments of the input. Users iteratively add, delete, and edit these sub-translations to amend erroneous formalizations, which is easier than manually redrafting the entire formalization. The framework is agnostic to specific application domains and can be extended to similar specification languages and new neural models. We perform a user study to obtain a challenging dataset, which we use to run experiments on the quality of translations. We provide an open-source implementation, including a web-based frontend.

    Iterative Circuit Repair Against Formal Specifications
    Matthias Cosler, Frederik Schmitt, Christopher Hahn and Bernd Finkbeiner · ICLR 2023

    We present a deep learning approach for repairing sequential circuits against formal specifications given in linear-time temporal logic (LTL). Given a defective circuit and its formal specification, we train Transformer models to output circuits that satisfy the corresponding specification. We propose a separated hierarchical Transformer for multimodal representation learning of the formal specification and the circuit. We introduce a data generation algorithm that enables generalization to more complex specifications and out-of-distribution datasets. In addition, our proposed repair mechanism significantly improves the automated synthesis of circuits from LTL specifications with Transformers. It improves the state of the art by 6.8 percentage points on held-out instances and 11.8 percentage points on an out-of-distribution dataset from the annual reactive synthesis competition.

    Neural Circuit Synthesis from Specification Patterns
    Frederik Schmitt, Christopher Hahn, Markus N. Rabe and Bernd Finkbeiner · NeurIPS 2021

    We train hierarchical Transformers on the task of synthesizing hardware circuits directly out of high-level logical specifications in linear-time temporal logic (LTL). The LTL synthesis problem is a well-known algorithmic challenge with a long history and an annual competition is organized to track the improvement of algorithms and tooling over time. New approaches using machine learning might open a lot of possibilities in this area, but suffer from the lack of sufficient amounts of training data. In this paper, we consider a method to generate large amounts of additional training data, i.e., pairs of specifications and circuits implementing them. We ensure that this synthetic data is sufficiently close to human-written specifications by mining common patterns from the specifications used in the synthesis competitions. We show that hierarchical Transformers trained on this synthetic data solve a significant portion of problems from the synthesis competitions, and even out-of-distribution examples from a recent case study.

    Deep Learning for Temporal Logics
    Frederik Schmitt, Christopher Hahn, Jens Kreber, Markus N. Rabe and Bernd Finkbeiner · AITP 2021

    Temporal logics are a well-established formal specification paradigm to specify the behavior of systems, and serve as inputs to industrial-strength verification tools. We report on current advances in applying deep learning to temporal logical reasoning tasks, showing that models can even solve instances where competitive classical algorithms timed out.

    Teaching Temporal Logics to Neural Networks
    Christopher Hahn, Frederik Schmitt, Jens Kreber, Markus N. Rabe and Bernd Finkbeiner · ICLR 2021

    We study two fundamental questions in neuro-symbolic computing: can deep learning tackle challenging problems in logics end-to-end, and can neural networks learn the semantics of logics. In this work we focus on linear-time temporal logic (LTL), as it is widely used in verification. We train a Transformer on the problem to directly predict a solution, i.e., a trace, to a given LTL formula. The training data is generated with classical solvers, which, however, only provide one of many possible solutions to each formula. We demonstrate that it is sufficient to train on those particular solutions to formulas, and that Transformers can predict solutions even to formulas from benchmarks from the literature on which the classical solver timed out. Transformers also generalize to the semantics of the logics: while they often deviate from the solutions found by the classical solvers, they still predict correct solutions to most formulas.
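    The underlying check, whether a predicted ultimately periodic ("lasso") trace satisfies an LTL formula, can be sketched with a minimal evaluator under the standard lasso semantics. The tuple encoding of formulas and all names here are our own illustration, not the paper's tooling:

```python
# A lasso trace is a list of sets of true propositions; positions from
# loop_start onward repeat forever. Formulas are nested tuples, e.g.
# ('G', ('->', 'a', ('F', 'b'))) for G(a -> F b).

def succ(i, n, loop_start):
    """Successor position on the lasso: wrap back at the end."""
    return i + 1 if i + 1 < n else loop_start

def reachable(i, n, loop_start):
    """All positions visited from i onward (finitely many on a lasso)."""
    seen, j = [], i
    while j not in seen:
        seen.append(j)
        j = succ(j, n, loop_start)
    return seen

def holds(f, trace, loop_start, i=0):
    n = len(trace)
    if isinstance(f, str):                    # atomic proposition
        return f in trace[i]
    op = f[0]
    if op == '!':
        return not holds(f[1], trace, loop_start, i)
    if op == '&':
        return holds(f[1], trace, loop_start, i) and holds(f[2], trace, loop_start, i)
    if op == '->':
        return (not holds(f[1], trace, loop_start, i)) or holds(f[2], trace, loop_start, i)
    if op == 'X':                             # next
        return holds(f[1], trace, loop_start, succ(i, n, loop_start))
    if op == 'F':                             # eventually
        return any(holds(f[1], trace, loop_start, j) for j in reachable(i, n, loop_start))
    if op == 'G':                             # globally
        return all(holds(f[1], trace, loop_start, j) for j in reachable(i, n, loop_start))
    raise ValueError(f"unknown operator {op}")

# Trace: {a}, then {b}, {} repeating (loop back to position 1).
trace, loop_start = [{"a"}, {"b"}, set()], 1
print(holds(('G', ('->', 'a', ('F', 'b'))), trace, loop_start))  # True
```

    Since every suffix of a lasso trace equals the suffix at one of finitely many representative positions, evaluating F and G over the reachable positions is sound; this is the kind of check used to judge whether a predicted trace is a correct solution.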

    Transformers Generalize to the Semantics of Logics
    Christopher Hahn, Frederik Schmitt, Jens Kreber, Markus N. Rabe and Bernd Finkbeiner · arXiv preprint

    We show that neural networks can learn the semantics of propositional and linear-time temporal logic (LTL) from imperfect training data. Instead of only predicting the truth value of a formula, we use a Transformer architecture to predict the solution for a given formula, e.g., a variable assignment for a formula in propositional logic. Most formulas have many solutions and the training data thus depends on the particularities of the generator. We make the surprising observation that while the Transformer does not perfectly predict the generator’s output, it still produces correct solutions to almost all formulas, even when its prediction deviates from the generator. It appears that it is easier to learn the semantics of the logics than the particularities of the generator. We observe that the Transformer preserves this semantic generalization even when challenged with formulas of a size it has never encountered before. Surprisingly, the Transformer solves almost all LTL formulas in our test set including those for which our generator timed out.

Teaching Activities