Tag: EA

  • Classifying Language Models along Autonomy & Trust Levels

    The Problem

    Language models are everywhere now. People praise them, but they also complain about their responses: unreliable output, hallucinations, the inability to let them work alone, and so on. These systems, capable of understanding and generating human-like text, are often called copilots, a term borrowed from aviation and motor racing. The term signals their main expected role: supporting the pilot.

    But how do we actually classify what these models can do? And more importantly, how much can we trust them?

    A Hybrid Classification Framework

    Drawing inspiration from the SAE levels of driving automation and grounded in human-computer interaction research on trust in automation, we propose a two-dimensional framework for classifying language models:

    1. Operational Autonomy – adapted from SAE Levels (0–5): What can the model do on its own?
    2. Cognitive Trust and Delegation – how much mental effort does the user expend, and how much responsibility is delegated?

    Each level in the chart below reflects both dimensions.

    Level | Autonomy Description | Trust/Delegation Role
    ----- | -------------------- | ---------------------
    0 – Basic Support | Passive tools like spellcheckers; no real autonomy | No Trust: User must fully control and interpret everything
    1 – Assisted Generation | Suggests words or phrases (autocomplete); constant oversight needed | Suggestive Aid: User supervises and approves each suggestion
    2 – Semi-Autonomous Text Production | Generates coherent content from prompts (emails, outlines); needs close supervision | Co-Creator: User relies on it in low-stakes tasks but reviews all outputs
    3 – Context-Aware Assistance | Can handle structured tasks (e.g., medical summaries); users remain alert | Delegate: User lets go during routine tasks but monitors for failure
    4 – Fully Autonomous Within Domains | Works independently in narrow contexts (e.g., customer service bot) | Advisor: Trusted within scope; user rarely intervenes
    5 – General Language Agent | Hypothetical general-purpose assistant capable across domains without oversight | Agent: Fully trusted to operate independently and responsibly
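    The two dimensions above can be captured as a small lookup table. The following is a minimal Python sketch; the data structure and function name are illustrative, not part of any standard:

```python
# Both dimensions of the framework as a lookup table.
# Strings are taken from the level chart; the code itself is illustrative.
LEVELS = {
    0: ("Basic Support", "No Trust"),
    1: ("Assisted Generation", "Suggestive Aid"),
    2: ("Semi-Autonomous Text Production", "Co-Creator"),
    3: ("Context-Aware Assistance", "Delegate"),
    4: ("Fully Autonomous Within Domains", "Advisor"),
    5: ("General Language Agent", "Agent"),
}

def classify(level: int) -> str:
    """Return a one-line description combining both dimensions."""
    autonomy, trust = LEVELS[level]
    return f"Level {level}: {autonomy} (role: {trust})"
```

    For example, `classify(2)` yields "Level 2: Semi-Autonomous Text Production (role: Co-Creator)".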

    Why SAE Levels Make Sense

    Although language models do not act in the physical world, comparing them to autonomous vehicles in terms of capabilities and limitations is instructive. The SAE classification helps clarify expectations, safety considerations, and technological milestones.

    Let’s first briefly revisit what each SAE level entails for automobiles:

    • Level 0 (No Automation): The human driver does everything; no automation features assist with driving beyond basic warnings.
    • Level 1 (Driver Assistance): The vehicle offers assistance with either steering or acceleration/deceleration but requires constant oversight.
    • Level 2 (Partial Automation): The system can manage both steering and acceleration but still requires the human to monitor closely.
    • Level 3 (Conditional Automation): The vehicle handles all aspects of driving under specific conditions; the human must be ready to intervene if necessary.
    • Level 4 (High Automation): The car can operate independently within designated areas or conditions without human input.
    • Level 5 (Full Automation): Complete autonomy in all environments—no human intervention needed.

    Adapting Levels to Language Models

    Level 0: Basic Support

    At this foundational level, language models serve as simple tools—spell checkers or basic chatbots—that provide minimal assistance without any real understanding or autonomy. They do not generate original content on their own but act as aids for humans who make all decisions.

    Example: Elementary grammar correction programs that flag mistakes but don’t suggest nuanced rewrites.

    Level 1: Assisted Generation

    Moving up one step, some language models begin offering suggestions based on partial input. For example, autocomplete functions in email clients that predict next words or phrases fall into this category—they assist but require constant supervision from users who must review outputs before accepting them.

    Example: Gmail’s smart compose feature.

    Level 2: Semi-Autonomous Text Production

    At this stage, models can generate longer stretches of coherent text when given prompts—think about AI tools that draft emails or outline articles—but they still demand continuous oversight. Users need to supervise outputs actively because errors such as factual inaccuracies or inappropriate tone remain common pitfalls.

    Example: ChatGPT generating email drafts or article outlines.

    Level 3: Context-Aware Assistance

    Now we reach an intriguing analogy with conditional automation—where AI systems can handle complex tasks within certain constraints yet require humans to step back temporarily while remaining alert for potential issues. Large language models operating at this level might manage summarization tasks under specific domains (e.g., medical summaries) but could falter outside their trained scope.

    Example: Medical AI assistants that can summarize patient records but require doctor oversight.

    Level 4: Fully Autonomous Within Domains

    Imagine an AI-powered assistant capable of managing conversations entirely within predefined contexts, say customer service bots handling standard inquiries autonomously within specified industries, but unable to operate beyond those limits without retraining or manual intervention.

    Example: Customer service chatbots for specific industries like banking or retail.

    Level 5: Fully Autonomous General Language Understanding

    Envisioning true “full autonomy” for language models means creating systems that understand context deeply across countless topics and produce accurate responses seamlessly everywhere, all without prompting from humans if desired. While such systems remain theoretical today, research aims toward developing general-purpose AI assistants capable not only of conversing fluently across domains but also of doing so responsibly without oversight.

    Example: Theoretical future AI systems that could operate across all domains without human oversight.

    Current State and Implications

    Now that we have a clear classification framework, let’s examine where we stand today and what this means for practical applications.

    What does this classification tell us about our current standing? Most contemporary large-scale language models sit somewhere around Level 2 or early Level 3—they generate impressive content when given prompts yet still struggle with consistency outside narrow contexts and require vigilant supervision by humans who evaluate accuracy critically.

    However, there’s an important limitation to the SAE analogy that we need to address.

    The Trust Dimension

    While the SAE levels offer a useful metaphor for understanding increasing autonomy, they aren’t a perfect fit for language models because:

    • Language models don’t act in the physical world themselves—humans interpret and act on their outputs
    • Risk and impact in NLP are mediated by human cognition and behavior, unlike the immediate physical risks of self-driving cars
    • Autonomy in NLP often deals more with semantic understanding, trustworthiness, context handling, and ethical alignment than sensor-actuator loops

    Therefore, we also propose a mapping of the SAE levels to trust levels that takes cognitive load and responsibility into account:

    • Level 0: No trust: tool offers isolated corrections, requires full user oversight (spellcheck)
    • Level 1: Suggestive aid: user must review and approve every suggestion (autocomplete)
    • Level 2: Co-creator: user maintains active oversight, only defers in low-stakes contexts (drafting emails)
    • Level 3: Delegate: user maintains regular oversight with frequent spot checks and validation (10-20% review)
    • Level 4: Advisor: user maintains strategic oversight with periodic reviews (5-10% audit), especially for high-stakes outputs
    • Level 5: Agent: user maintains governance oversight with systematic audits (1-5% review) despite autonomous operation
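    The review quotas in the list above can be read as a simple lookup. The following minimal Python sketch turns them into a sampling rule; the function name and the conservative/optimistic switch are my own invention:

```python
import math

# Spot-check fractions per trust level, taken from the list above.
# Levels 0-2 imply reviewing every output; higher levels reduce the quota.
REVIEW_FRACTION = {
    0: (1.00, 1.00),  # no trust: full oversight
    1: (1.00, 1.00),  # suggestive aid: approve every suggestion
    2: (1.00, 1.00),  # co-creator: review all outputs
    3: (0.10, 0.20),  # delegate: frequent spot checks
    4: (0.05, 0.10),  # advisor: periodic reviews
    5: (0.01, 0.05),  # agent: systematic audits
}

def sample_size(level: int, outputs: int, conservative: bool = True) -> int:
    """Number of outputs to review at a given trust level."""
    low, high = REVIEW_FRACTION[level]
    return math.ceil(outputs * (high if conservative else low))
```

    For a Level 3 delegate producing 100 outputs, `sample_size(3, 100)` suggests reviewing 20 of them.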

    Practical Implications

    Classifying language models along SAE-like levels provides practical benefits:

    1. Common vocabulary for developers, researchers, policymakers, and end-users
    2. Realistic expectations about capabilities—the difference between tools assisting writing versus fully automating complex decision-making processes
    3. Regulatory guidance for ensuring safe deployment at each stage
    4. The effort required per level increases, probably exponentially

    Design Priorities

    It’s vital to categorize these technologies not simply for academic interest but because such clarity informs design priorities:

    • Should future efforts focus on improving reliability before granting more independence?
    • How do safety concerns evolve as we move up each level?
    • What ethical considerations arise when deploying increasingly autonomous NLP systems?

    Each incremental step toward higher levels demands careful consideration regarding:

    • Transparency: Can users understand when they’re interacting with an assistant versus an agent?
    • Accountability: Who bears responsibility if an AI-generated statement causes harm?

    Conclusion

    Applying SAE-level classifications offers more than just terminology—it provides a roadmap illustrating how far we’ve come and how much further we need to go in developing intelligent language systems capable not only of mimicking human conversation but doing so responsibly across diverse environments.

    Recognizing where current technology resides on this spectrum enables us all—from engineers designing smarter assistants to regulators crafting informed policies—to make conscious choices grounded in realistic assessments rather than hype or fear.

    As artificial intelligence continues its ascent along these levels—from rudimentary support towards full autonomy—the journey will demand ongoing collaboration among technologists, ethicists, policymakers, and ultimately society itself to ensure these powerful tools serve humanity’s best interests every step along the way.

    References

    SAE International. “Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles.” SAE J3016, first published 2014; most recent version (as of 2024): SAE J3016_202104 (April 2021). https://www.sae.org/standards/content/j3016_202104/

    Hoffman, R. R., Johnson, M., Bradshaw, J. M., & Underbrink, A. (2013). “Trust in automation.” IEEE Intelligent Systems, 28(1), 84–88. DOI: 10.1109/MIS.2013.24

  • Executive Summary: Application Lifecycle in EAM

    #architecture #clarity #velocity #direction

    Application Lifecycle Management (ALM) in LeanIX is a central component of Enterprise Architecture Management (EAM). It enables companies to manage and optimize the entire lifecycle of their applications effectively. This process covers all phases, from planning and development through operation to the retirement of applications.

    As an EAM tool, LeanIX offers extensive functionality to support application owners in managing their applications. It provides a holistic view of the IT landscape and helps identify dependencies, risks, and optimization potential.

    In this post, we will first explain the importance of ALM for application owners and then present concrete improvement proposals for its implementation in LeanIX. The goal is to increase the efficiency and effectiveness of application lifecycle management and thereby create greater value for the company.

    Raising Awareness Among Application Owners

    To convince application owners who are responsible for several applications, and who believe they do not need LeanIX, of the importance of EAM in general and of the tool in particular, the following test questions focusing on architecture, processes, and data can be asked:

    a) Architecture-related questions:

    • How quickly can you find out which of your applications would be affected by a planned infrastructure change?
    • Which of your applications use outdated technologies and need to be modernized in the near future?

    b) Process-related questions:

    • How would you describe the influence of one of your applications on the company’s entire value chain?
    • If one of your applications fails, how quickly can you identify all affected business processes?

    c) Data-related questions:

    • Can you sketch, for each of your applications, the data entities it processes and their data flows?
    • Can you state ad hoc which of your applications process personal data and how that data is protected?

    d) Cross-cutting questions:

    • How quickly can you compile all relevant information about your applications in response to an audit request?
    • How do you ensure that all stakeholders are always informed about the current state of, and planned changes to, your applications?

    Improvement Proposals for Application Lifecycle Management in LeanIX

    To support application owners in maintaining their applications in LeanIX, to align the enterprise architecture more closely with the business, and to leverage the connection to data management, I propose the following concrete activities as a basis for discussion:

    1. Training and workshops for application owners:
      • Organize regular training on LeanIX and best practices
      • Run workshops that clarify the relationship between applications, business processes, and data
      • Create practical guides and checklists for maintaining applications in LeanIX in an easily accessible tool such as Confluence
      • Create LeanIX surveys through which application owners can easily maintain relevant information by answering tailored question catalogs
    2. Process-oriented modeling in LeanIX:
      • Implement a process-oriented view in LeanIX
      • Link applications to the business processes they support
      • Visualize each application’s contribution to the value chain
    3. Integration of data management aspects:
      • Extend the LeanIX metamodel with relevant data management attributes
      • Link applications to the data entities they process
      • Implement data flow diagrams that show the relationship between applications and data
    4. Automation and integration:
      • Implement interfaces between LeanIX and other relevant tools (e.g., BPM, data management platform)
      • Automate the updating of basic information in LeanIX
      • Create dashboards that visualize maintenance status and data quality
    5. Governance and incentives:
      • Establish clear responsibilities and SLAs for maintaining application information
      • Implement a reward system for application owners who keep their data up to date
      • Conduct regular reviews of the application landscape
    6. Data governance integration:
      • Link data governance roles (e.g., data owner, data steward) to the corresponding applications in LeanIX
      • Implement attributes for data classification and data protection requirements on applications
      • Create reports that show data governance aspects across the entire application landscape
    7. Continuous improvement:
      • Establish a regular feedback process with application owners
      • Analyze usage patterns in LeanIX to identify improvement potential
      • Continuously adapt the metamodel and processes based on feedback
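    As a basis for the dashboards on maintenance status and data quality suggested above, a completeness score per application fact sheet could look like the following minimal sketch. The required attributes and the sample record are invented; a real implementation would read fact sheets from the LeanIX API instead:

```python
# Hypothetical list of attributes every fact sheet should have filled in.
REQUIRED = ["owner", "lifecycle", "business_process", "data_entities"]

def completeness(fact_sheet: dict) -> float:
    """Fraction of required attributes that are filled in (0.0 to 1.0)."""
    filled = sum(1 for attr in REQUIRED if fact_sheet.get(attr))
    return filled / len(REQUIRED)

# An invented example record: three of four attributes are maintained.
app = {"owner": "Jane Doe", "lifecycle": "active",
       "business_process": None, "data_entities": ["Customer"]}
```

    Here `completeness(app)` returns 0.75; aggregating such scores across the portfolio gives the dashboard its data-quality view.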
  • Executive Summary: Data Strategy 2.0

    #architecture #clarity #velocity #direction 

    In my last post, Executive Summary: Strategic Data Science, I summarized what data science is and what it consists of. Moreover, I argued that you need to deploy a strategy that helps you manage the transformation to a data-driven business.

    Today, you will see that a strategy for data science can be handled just like any other data strategy. And if you already have a data strategy deployed, e.g. as part of your governance or architecture initiative, you will see why and where it is affected.

    As written in Executive Summary on EA Maturity, having a map and knowing where you are and where you want to go helps a lot in finding a way.

    Maturity

    If you are working with maturity models, you typically do this on a yearly basis. For chosen capabilities, you identify current vs. target maturity, e.g. ranked from level 1 to 5.

    The first thing you need to understand is that introducing data science for the first time reduces your overall maturity at once. Why is that?

    Maturity is measured in terms of capabilities. And if you take a look at those capabilities, you will find that you need to adapt them. There are typically a dozen or so, such as vision, objectives, people, processes, policies, master data management, business intelligence, big data analytics, data quality, data modeling, data asset planning, data integration, and metadata management.
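    The yearly current-vs-target assessment can be sketched as a small gap analysis. Capability names follow the list above, while the maturity numbers are invented for illustration:

```python
# Per capability: (current, target) maturity on the 1-5 scale.
# The numbers are invented example values, not real assessments.
capabilities = {
    "vision":              (3, 4),
    "people":              (2, 4),
    "data quality":        (3, 3),
    "metadata management": (1, 3),
}

# Gap = target minus current maturity level.
gaps = {name: target - current
        for name, (current, target) in capabilities.items()}

# Capabilities with the largest gap deserve attention first.
priorities = sorted(gaps, key=gaps.get, reverse=True)
```

    In this example, "people" and "metadata management" each have a gap of two levels and top the priority list.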

    I will pick only a few as examples to make things clear. Let’s pick vision, people, and technology.

    Selected Capabilities for Explaining Maturity of Data Strategy

    Vision

    Say you have a vision like: “Providing customer care that is so satisfying that every customer comes back to us with a smile.” That’s a very strong statement, but how about: “Keeping every customer satisfied by solving all problems before they complain.” Wow, even stronger. It is possible because data science allows you to predict what others can’t.

    People

    Probably, you already have a data architect. But the classic data architect focuses on architecture, technology, and governance issues. This is fine, but you also need a data advisor focusing on unseen solutions for the business: someone telling you to combine customer data with product usage data to increase your sales, and perhaps even telling you which of your precious data you can turn into completely new data-driven products you can sell.
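    The data advisor’s suggestion of combining customer data with product usage data can be illustrated with a toy example. All records, field names, and the threshold are invented:

```python
# Invented customer master data and usage figures.
customers = [
    {"id": 1, "plan": "basic"},
    {"id": 2, "plan": "basic"},
    {"id": 3, "plan": "premium"},
]
usage = {1: 950, 2: 120, 3: 800}  # monthly API calls per customer id

# Basic-plan customers with heavy usage are likely upgrade candidates:
# combining the two sources surfaces sales opportunities neither shows alone.
upsell = [c["id"] for c in customers
          if c["plan"] == "basic" and usage[c["id"]] > 500]
```

    Here only customer 1 qualifies: on the basic plan, but with usage far above the (invented) threshold.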

    Technology

    Probably, you also have an inventory telling you which data sources are used in your applications. Adding data science as a rapidly growing discipline to the equation, you may find that you have to revise your technology portfolio. It is growing and changing rapidly and therefore needs to be governed to a certain degree (freedom vs. standardization).

    The following list shows selected technologies that are most often used in data science (ranked from left to right).

    • Programming Languages: SQL, Python, R
    • Relational Databases: MySQL, MS SQL Server, PostgreSQL
    • Big data platforms: Spark, Hive, MongoDB
    • Spreadsheets, BI, Reporting: Excel, Power BI, QlikView

    Moreover, there is a shift in who is actually using these technologies: leadership, finance, sales, and marketing. And increasingly without dedicated enterprise applications, because data analysis is very dynamic and involves a lot of trial and error.

    Conclusion

    From these few capabilities out of a dozen or more, it has become clear that a data science strategy fits easily into an overall data strategy. There is no need to reinvent the wheel. Instead, adapt your existing or favorite data strategy to incorporate data science.

  • Executive Summary: Strategic Data Science

    #architecture #clarity #velocity #direction #data

    If you, as a C-level executive, are already using or planning to use data science, you probably pursue the goal of increasing your market share by making predictions that others can’t. You might think that there is no need for strategic management of data science. Actually, that’s as far from the truth as it can get. But why is that? It is because of the considerable complexity indicated by the figure below and discussed in the following.

    The Flower of Complexity

    Definition

    First, let’s take a look at the definition:

    Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from many structured and unstructured data.

    Source: Wikipedia

    There are a lot of keywords in this rather short definition that should raise your eyebrows: inter-disciplinary, methods, processes, algorithms, systems, many.

    Basic Method

    Now, let’s pick a keyword from above and dig deeper e.g. recall the basic scientific method:

    1. Find a question
    2. Collect data
    3. Prepare data for analysis
    4. Create model
    5. Evaluate model
    6. Deploy model

    Doesn’t sound overly complex, but let’s finally take a deep dive. Which of those phases do you think is responsible for most of the effort spent? It is the step that roughly amounts to 80% of the overall process. There are even several synonyms for it, such as data munging, data wrangling, and data cleaning or cleansing. You guessed right: it is phase three. Its complexity is mainly driven by the number of different data sources, the number and complexity of the data structures involved, and sometimes also by mixed-in unstructured data.
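    The dominance of phase three shows up even in a toy example: before any modeling, records need whitespace fixes, type conversions, missing-value handling, and deduplication. A minimal sketch, with invented records and field names:

```python
# Invented raw records as they might arrive from a messy source.
raw = [
    {"customer": " Alice ", "revenue": "1200"},
    {"customer": "Bob",     "revenue": None},
    {"customer": " Alice ", "revenue": "1200"},  # duplicate
]

def clean(records):
    """Typical wrangling steps: trim, convert types, fill gaps, deduplicate."""
    seen, out = set(), []
    for r in records:
        name = r["customer"].strip()        # fix stray whitespace
        revenue = float(r["revenue"] or 0)  # convert type, fill missing value
        if name not in seen:                # drop duplicate customers
            seen.add(name)
            out.append({"customer": name, "revenue": revenue})
    return out
```

    Even this trivial cleaner already encodes four separate decisions; multiply that by dozens of sources and data structures and the 80% figure becomes plausible.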

    Conclusion

    We can go on like this for a while, but I do not want to bore you with the details. So let’s summarize first, and afterward I will deliver a compressed list of further aspects which you may take note of or skip altogether.

    Forecast:
    If you do not strategically manage data science in your enterprise you may expect another area of proliferation which you should urgently avoid!

    Solution:
    I can help you with that. My approach is to combine data science with an architecture development cycle. Proven methods and tools will help you to master the inherent complexity and get the most out of data science for your business. You can leave the details to me.

    The Details

    Data science as a discipline delivers methods like the one we have discussed above. Yet, it also

    • combines subjects like
      • computer science
      • math & statistics
      • business domain knowledge
    • involves interdisciplinary roles like
      • Data Engineer
      • Data Scientist
      • Business Analyst
      • Product Owner / Project Manager
      • Developer
      • User Interface Specialist
    • implies many skills like
      • programming
      • working with data
      • descriptive statistics
      • data visualization
      • statistical modeling
      • handling Big Data
      • machine learning
      • deploying to production
    • is done with many tools like
      (only top 3-4 in each category named here)
      • programming languages
        • SQL
        • Python
        • R
      • databases
        • MySQL
        • MS SQL Server
        • PostgreSQL
        • Oracle
      • Big data platforms
        • Spark
        • Hive
        • MongoDB
        • Amazon Redshift
      • Spreadsheets, BI, Reporting
        • Excel
        • Power BI
        • QlikView

    And the list is growing steadily. A little exhausting, isn’t it? At this point, at the latest, you should be convinced that data science needs strategic attention.

  • Executive Summary on the Ivory Tower Syndrome

    #architecture #clarity #velocity #direction

    This is the executive summary of last week’s post, Oh please, get down from the ivory tower and get something done!

    EA Ivory Tower Syndrome

    When does it happen?

    The Ivory Tower Syndrome describes an often-seen drift of EA initiatives that deal mostly with themselves and focus solely on strategic management, while having already lost traction, and therefore acceptance, with the ground force.

    Why does it happen?

    Some EA initiatives tend to focus more on strategic reporting to upper levels and try to govern by a code of law only. But the ground force, in terms of actual projects and product development, needs support for the huge amount of concrete work that has to be done within granted budgets and milestones. Under a law-only approach they feel not supported but only punished (missing the carrot in “carrot and stick”).

    A common misconception of corporate EA initiatives is that they can work like political government and urban planning. But unlike how, e.g., power grids are managed (or water grids, gas grids, metro systems, and so on), companies often provide only a fraction of the needed services.

    How to avoid and improve?

    • An adequate EA authority shall be balanced with a compact code of law.
    • The EA authority shall collaborate with other authorities such as revision and portfolio management.
    • Do not act jurisdictionally, because companies, unlike politics and urban planning, have no jurisdiction.
    • Align objectives of managers with your EA strategy or vice versa.
    • Implement cost-saving services for each of your laws (give the tiger some teeth).
    • Include projects and product development in a community. Communicate outstanding achievements. Recognized employees drive acceptance for you!