Tag: strategy

  • Classifying Language Models along Autonomy & Trust Levels

    The Problem

    Language models are everywhere now. People praise them, but they also complain about the responses: unreliable, prone to hallucination, impossible to leave working on their own, and so on. These systems, capable of understanding and generating human-like text, are often called copilots, a term borrowed from aerospace and car racing. The term signals that their main expected role is to support the pilot.

    But how do we actually classify what these models can do? And more importantly, how much can we trust them?

    A Hybrid Classification Framework

    Drawing inspiration from the SAE levels of driving automation and grounded in human-computer interaction research on trust in automation, we propose a two-dimensional framework for classifying language models:

    1. Operational Autonomy – adapted from SAE Levels (0–5): What can the model do on its own?
    2. Cognitive Trust and Delegation – how much mental effort does the user expend, and how much responsibility is delegated?

    Each level in the table below reflects both dimensions.

    Level | Autonomy Description | Trust/Delegation Role
    0 – Basic Support | Passive tools like spellcheckers; no real autonomy | No Trust: User must fully control and interpret everything
    1 – Assisted Generation | Suggests words or phrases (autocomplete); constant oversight needed | Suggestive Aid: User supervises and approves each suggestion
    2 – Semi-Autonomous Text Production | Generates coherent content from prompts (emails, outlines); needs close supervision | Co-Creator: User relies on it in low-stakes tasks but reviews all outputs
    3 – Context-Aware Assistance | Can handle structured tasks (e.g., medical summaries); users remain alert | Delegate: User lets go during routine tasks but monitors for failure
    4 – Fully Autonomous Within Domains | Works independently in narrow contexts (e.g., customer service bot) | Advisor: Trusted within scope; user rarely intervenes
    5 – General Language Agent | Hypothetical general-purpose assistant capable across domains without oversight | Agent: Fully trusted to operate independently and responsibly
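
    If it helps to see the framework in machine-readable form, the table above can be encoded in a few lines. The following minimal sketch in Python is purely illustrative; the names and descriptions are simply lifted from the table and are not part of any standard:

    ```python
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class LMLevel:
        """One level of the two-dimensional classification."""
        level: int        # operational autonomy, adapted from SAE 0-5
        name: str
        autonomy: str     # what the model can do on its own
        trust_role: str   # how much responsibility the user delegates

    LEVELS = [
        LMLevel(0, "Basic Support", "Passive tools like spellcheckers; no real autonomy", "No Trust"),
        LMLevel(1, "Assisted Generation", "Suggests words or phrases; constant oversight needed", "Suggestive Aid"),
        LMLevel(2, "Semi-Autonomous Text Production", "Generates coherent content from prompts", "Co-Creator"),
        LMLevel(3, "Context-Aware Assistance", "Handles structured tasks within a domain", "Delegate"),
        LMLevel(4, "Fully Autonomous Within Domains", "Works independently in narrow contexts", "Advisor"),
        LMLevel(5, "General Language Agent", "Hypothetical general-purpose assistant", "Agent"),
    ]

    def trust_role(level: int) -> str:
        """Look up the delegation role implied by an autonomy level."""
        return LEVELS[level].trust_role
    ```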

    Why SAE Levels Make Sense

    Although language models do not act in the physical world, comparing them to autonomous vehicles in terms of capabilities and limitations makes sense. The SAE classification helps clarify expectations, safety considerations, and technological milestones.

    Let’s first briefly revisit what each SAE level entails for automobiles:

    • Level 0 (No Automation): The human driver does everything; no automation features assist with driving beyond basic warnings.
    • Level 1 (Driver Assistance): The vehicle offers assistance with either steering or acceleration/deceleration but requires constant oversight.
    • Level 2 (Partial Automation): The system can manage both steering and acceleration but still requires the human to monitor closely.
    • Level 3 (Conditional Automation): The vehicle handles all aspects of driving under specific conditions; the human must be ready to intervene if necessary.
    • Level 4 (High Automation): The car can operate independently within designated areas or conditions without human input.
    • Level 5 (Full Automation): Complete autonomy in all environments—no human intervention needed.

    Adapting Levels to Language Models

    Level 0: Basic Support

    At this foundational level, language models serve as simple tools—spell checkers or basic chatbots—that provide minimal assistance without any real understanding or autonomy. They do not generate original content on their own but act as aids for humans who make all decisions.

    Example: Elementary grammar correction programs that flag mistakes but don’t suggest nuanced rewrites.

    Level 1: Assisted Generation

    Moving up one step, some language models begin offering suggestions based on partial input. For example, autocomplete functions in email clients that predict next words or phrases fall into this category—they assist but require constant supervision from users who must review outputs before accepting them.

    Example: Gmail’s smart compose feature.

    Level 2: Semi-Autonomous Text Production

    At this stage, models can generate longer stretches of coherent text when given prompts—think about AI tools that draft emails or outline articles—but they still demand continuous oversight. Users need to supervise outputs actively because errors such as factual inaccuracies or inappropriate tone remain common pitfalls.

    Example: ChatGPT generating email drafts or article outlines.

    Level 3: Context-Aware Assistance

    Now we reach an intriguing analogy with conditional automation—where AI systems can handle complex tasks within certain constraints yet require humans to step back temporarily while remaining alert for potential issues. Large language models operating at this level might manage summarization tasks under specific domains (e.g., medical summaries) but could falter outside their trained scope.

    Example: Medical AI assistants that can summarize patient records but require doctor oversight.

    Level 4: Fully Autonomous Within Domains

    Imagine an AI-powered assistant capable of managing conversations entirely within predefined contexts (say, customer service bots handling standard inquiries autonomously within specified industries) but unable to operate beyond those limits without retraining or manual intervention.

    Example: Customer service chatbots for specific industries like banking or retail.

    Level 5: Fully Autonomous General Language Understanding

    Envisioning true “full autonomy” for language models means creating systems that understand context deeply across countless topics and produce accurate responses seamlessly everywhere, even without prompting from humans if desired. While such systems remain theoretical today, research aims toward developing general-purpose AI assistants capable not only of conversing fluently across domains but also of doing so responsibly without oversight.

    Example: Theoretical future AI systems that could operate across all domains without human oversight.

    Current State and Implications

    Now that we have a clear classification framework, let’s examine where we stand today and what this means for practical applications.

    What does this classification tell us about our current standing? Most contemporary large-scale language models sit somewhere around Level 2 or early Level 3: they generate impressive content when given prompts yet still struggle with consistency outside narrow contexts and require vigilant supervision by humans who evaluate accuracy critically.

    However, there’s an important limitation to the SAE analogy that we need to address.

    The Trust Dimension

    While the SAE levels offer a useful metaphor for understanding increasing autonomy, they aren’t a perfect fit for language models because:

    • Language models don’t act in the physical world themselves—humans interpret and act on their outputs
    • Risk and impact in NLP are mediated by human cognition and behavior, unlike the immediate physical risks of self-driving cars
    • Autonomy in NLP often deals more with semantic understanding, trustworthiness, context handling, and ethical alignment than sensor-actuator loops

    Therefore, I also propose a mapping of the SAE levels to trust levels that takes cognitive load and responsibility into account:

    • Level 0: No trust: tool offers isolated corrections, requires full user oversight (spellcheck)
    • Level 1: Suggestive aid: user must review and approve every suggestion (autocomplete)
    • Level 2: Co-creator: user maintains active oversight, only defers in low-stakes contexts (drafting emails)
    • Level 3: Delegate: user maintains regular oversight with frequent spot checks and validation (10-20% review)
    • Level 4: Advisor: user maintains strategic oversight with periodic reviews (5-10% audit), especially for high-stakes outputs
    • Level 5: Agent: user maintains governance oversight with systematic audits (1-5% review) despite autonomous operation
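
    To make the oversight effort behind these roles tangible, here is a minimal sketch in Python. The review fractions are taken from the ranges above, rounded to their mid-points; everything else is illustrative. It selects which model outputs still go to a human reviewer at each level:

    ```python
    import random

    # Approximate share of outputs a human still reviews per level, using the
    # mid-points of the ranges above; Levels 0-2 imply reviewing everything.
    REVIEW_RATE = {0: 1.0, 1: 1.0, 2: 1.0, 3: 0.15, 4: 0.075, 5: 0.03}

    def sample_for_review(outputs: list[str], level: int, seed: int = 42) -> list[str]:
        """Pick the subset of model outputs that goes to a human reviewer."""
        rate = REVIEW_RATE[level]
        if rate >= 1.0:
            return list(outputs)                 # full oversight below Level 3
        k = max(1, round(rate * len(outputs)))   # never drop to zero spot checks
        return random.Random(seed).sample(outputs, k)

    # Example: a Level-3 "delegate" setup sends about 15 of 100 drafts to review.
    drafts = [f"draft_{i}" for i in range(100)]
    print(len(sample_for_review(drafts, level=3)))  # -> 15
    ```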

    Practical Implications

    Classifying language models along SAE-like levels provides practical benefits:

    1. Common vocabulary for developers, researchers, policymakers, and end-users
    2. Realistic expectations about capabilities—the difference between tools assisting writing versus fully automating complex decision-making processes
    3. Regulatory guidance for ensuring safe deployment at each stage
    4. Awareness that the effort required to reach each successive level increases sharply, probably exponentially

    Design Priorities

    It’s vital to categorize these technologies not simply for academic interest but also because such clarity informs design priorities:

    • Should future efforts focus on improving reliability before granting more independence?
    • How do safety concerns evolve as we move up each level?
    • What ethical considerations arise when deploying increasingly autonomous NLP systems?

    Each incremental step toward higher levels demands careful consideration regarding:

    • Transparency: Can users understand when they’re interacting with an assistant versus an agent?
    • Accountability: Who bears responsibility if an AI-generated statement causes harm?

    Conclusion

    Applying SAE-level classifications offers more than just terminology—it provides a roadmap illustrating how far we’ve come and how much further we need to go in developing intelligent language systems capable not only of mimicking human conversation but doing so responsibly across diverse environments.

    Recognizing where current technology resides on this spectrum enables us all—from engineers designing smarter assistants to regulators crafting informed policies—to make conscious choices grounded in realistic assessments rather than hype or fear.

    As artificial intelligence continues its ascent along these levels—from rudimentary support towards full autonomy—the journey will demand ongoing collaboration among technologists, ethicists, policymakers, and ultimately society itself to ensure these powerful tools serve humanity’s best interests every step along the way.

    References

    SAE J3016™. “Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles.” First published 2014; most recent version (as of 2024): SAE J3016_202104 (April 2021). https://www.sae.org/standards/content/j3016_202104/

    Hoffman, R. R., Johnson, M., Bradshaw, J. M., & Underbrink, A. (2013). “Trust in automation.” IEEE Intelligent Systems, 28(1), 84–88. DOI: 10.1109/MIS.2013.24

  • Executive Summary: Application Lifecycle in EAM

    #architecture #clarity #velocity #direction

    Application Lifecycle Management (ALM) in LeanIX is a central component of Enterprise Architecture Management (EAM). It enables companies to manage and optimize the entire lifecycle of their applications. This process covers all phases, from planning and development through operation to the retirement of applications.

    As an EAM tool, LeanIX offers extensive functionality to support application owners in managing their applications. It provides a holistic view of the IT landscape and helps identify dependencies, risks, and optimization potential.

    In this post, we will first explain why ALM matters to application owners and then present concrete suggestions for its implementation in LeanIX. The goal is to increase the efficiency and effectiveness of Application Lifecycle Management and thereby create greater value for the company.

    Raising Awareness among Application Owners

    To convince application owners who are responsible for several applications and believe they do not need LeanIX of the importance of EAM in general and of the tool in particular, the following test questions, focused on architecture, processes, and data, can be asked:

    a) Architecture-related questions:

    • How quickly can you find out which of your applications would be affected by a planned infrastructure change?
    • Which of your applications use outdated technologies and will have to be modernized in the near future?

    b) Process-related questions:

    • How would you describe the influence of one of your applications on the company’s entire value chain?
    • If one of your applications fails, how quickly can you identify all affected business processes?

    c) Data-related questions:

    • Can you sketch, for each of your applications, the processed data entities and their data flows?
    • Can you state ad hoc which of your applications process personal data and how that data is protected?

    d) Cross-cutting questions:

    • How quickly can you compile all relevant information about your applications when an audit request comes in?
    • How do you ensure that all stakeholders are always informed about the current state of your applications and any planned changes?

    Suggestions for Improving Application Lifecycle Management in LeanIX

    To support application owners in maintaining their applications in LeanIX, to align the enterprise architecture more closely with the business, and to leverage the connection to data management, I propose the following concrete activities as a basis for discussion:

    1. Training and workshops for application owners:
      • Organize regular training sessions on LeanIX and best practices
      • Run workshops that make the relationship between applications, business processes, and data clear
      • Create hands-on guides and checklists for maintaining applications in LeanIX in an easily accessible tool such as Confluence
      • Create LeanIX surveys through which application owners can provide relevant information simply by answering tailored questionnaires
    2. Process-oriented modeling in LeanIX:
      • Implement a process-oriented view in LeanIX
      • Link applications to the business processes they support
      • Visualize each application’s contribution to the value chain
    3. Integration of data management aspects:
      • Extend the LeanIX metamodel with relevant data management attributes
      • Link applications to the data entities they process
      • Implement data flow diagrams that show the relationship between applications and data
    4. Automation and integration (a small sketch follows this list):
      • Implement interfaces between LeanIX and other relevant tools (e.g. BPM, data management platform)
      • Automate the updating of basic information in LeanIX
      • Create dashboards that visualize maintenance status and data quality
    5. Governance and incentives:
      • Establish clear responsibilities and SLAs for maintaining application information
      • Implement a reward system for application owners who keep their data up to date
      • Conduct regular reviews of the application landscape
    6. Data governance integration:
      • Link data governance roles (e.g. Data Owner, Data Steward) to the corresponding applications in LeanIX
      • Implement attributes for data classification and data protection requirements on applications
      • Create reports that show data governance aspects across the entire application landscape
    7. Continuous improvement:
      • Establish a regular feedback process with application owners
      • Analyze usage patterns in LeanIX to identify potential for improvement
      • Continuously adapt the metamodel and processes based on that feedback
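
    As an illustration of point 4, a scheduled job that pushes basic lifecycle information into LeanIX could look roughly like the following Python sketch. It assumes a workspace API token and the OAuth2 token and GraphQL endpoints that LeanIX documents publicly; the exact mutation and patch format are assumptions here and must be verified against your workspace’s metamodel and the current API documentation.

    ```python
    import requests

    BASE = "https://your-workspace.leanix.net"   # assumption: your workspace host
    API_TOKEN = "..."                            # technical user token from the LeanIX admin area

    def get_access_token() -> str:
        """Exchange the workspace API token for a short-lived OAuth2 access token."""
        resp = requests.post(
            f"{BASE}/services/mtm/v1/oauth2/token",
            auth=("apitoken", API_TOKEN),
            data={"grant_type": "client_credentials"},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()["access_token"]

    def update_lifecycle(factsheet_id: str, phase: str, start_date: str) -> dict:
        """Patch the lifecycle of one application fact sheet.
        The mutation and patch path are assumptions; check them against the
        GraphQL schema of your LeanIX workspace before relying on them."""
        mutation = """
          mutation ($id: ID!, $patches: [Patch]!) {
            updateFactSheet(id: $id, patches: $patches) {
              factSheet { id displayName }
            }
          }
        """
        patches = [{
            "op": "replace",
            "path": "/lifecycle",
            "value": f'{{"phases":[{{"phase":"{phase}","startDate":"{start_date}"}}]}}',
        }]
        resp = requests.post(
            f"{BASE}/services/pathfinder/v1/graphql",
            headers={"Authorization": f"Bearer {get_access_token()}"},
            json={"query": mutation, "variables": {"id": factsheet_id, "patches": patches}},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()

    # Example (hypothetical fact sheet id):
    # update_lifecycle("28fe4aa2-0000-0000-0000-000000000000", "active", "2024-01-01")
    ```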
  • Tool Styles for Architecture

    #architecture #clarity #velocity #direction #enterprise #modeling

    When doing enterprise architecture as well as systems engineering (or, more specifically, system architecture), the question arises whether there can be one metamodel, such as Archimate, and one tool that does it all.

    That means supporting the strategic portfolio level (comparable to city planning) as well as the development-oriented system level (architecture for one building at a time).

    Roles vs Tool Styles

    A very important requirement when talking about the ‘enterprise’ is easy access to the captured landscape and its building blocks, be they business products, applications, data, or technology, as well as to blueprints, planned architectures, and governance. This access must be provided to a wide range of users across the enterprise, who very probably have various roles and skill levels.

    Since all this architectural information shall not only be consumed but also improved and maintained in a distributed fashion, it becomes clear that a tool focused on a diagram-first modeling style cannot be the answer, no matter whether it is based on UML, SysML, Archimate, or any other formal modeling language. The reason is that most users are not able to model, and only a fraction can be taught to do so given the costs. Training therefore usually focuses on roles that already have a skill set to build upon, i.e., mostly roles matching the word ‘architect’. Prominent examples of such expert tools are MagicDraw (Cameo) by No Magic (now Dassault Systèmes), ARIS Architect by Software AG, Adonis by BOC, Innovator by MID, Enterprise Architect by Sparx, and others. But, as we will see, any of these can be part of your overall story.

    Since the average user needs a tool that makes life easier, not harder, most EA tool vendors have focused on an approach in a data-first ERP style, typically providing web-based access to collected portfolio information (products, applications, business objects, technologies, and such). That information is presented as profiles or sheets, for example one per application, which can also be updated via data forms. Tools like Alfabet (planningIT) by Software AG, LeanIX by LeanIX, LUY by iteratec, ADOIT by BOC, and others follow this path. From the captured data, they automatically produce dynamic graphical or tabular reports. Some of these tools also support Archimate, ranging from basic import, through additions to their own metamodels, to their own metamodels based on Archimate.

    But why do EA tools always provide more than Archimate does? Because many aspects that matter in daily practice are missing from Archimate: roles and permissions, multi-tenancy, lifecycle information (planned, active, deactivated; in general, a state per date interval), portfolio planning capabilities (as-is, plan, to-be; the latter with alternatives), tool integration features (requirements, publication, test management), and a lot more.

    Scaled Architecture

    On the other hand, the last decade has shown that focusing only on the strategic portfolio level while ignoring the reality on the ground easily leads to the ivory-tower syndrome: plans to change the IT landscape that are poorly accepted.

    In order to avoid this, it is important to couple portfolio data with refined software and hardware architectures. The portfolio acts as the common structure, and its content had better reflect the software and hardware behind it (compare with reporting). And that’s where the above-mentioned architects come into play again. They can bridge this gap by drilling down to the system architecture level and even further.

    In that case, diagram-first modeling tools for experts are more appropriate. As mentioned above, these are typically based on UML, SysML, Archimate, another modeling language, or even a combination of those. Modeling tools supporting Archimate as a dialect can make integration with enterprise architecture tools that also support Archimate a little easier.

    Conclusion

    Being able to address both worlds is important and not an easy task. Common metamodels may help, but they are not a must. More important is the ability to map high-level enterprise architecture blocks to medium-level system architecture content, and that in turn to low-level system design content, which can partly be reverse engineered directly from running systems.

    There are also tools that address both styles, such as ADOIT or upcoming versions of Bpanda, which might be helpful, too; let’s call it a hybrid data-diagram style. Again, it is not a must, especially if different tools are already established in the organization and need to be integrated. The options range from built-in integration features such as export/import capabilities to separate integration tools such as Smartfacts, which provides a portal merging data from different tools via OSLC or classic synchronization.

  • Service for You: Architecture Potential Analysis

    #architecture #clarity #velocity #direction 

    I will find out for you what potential is hidden in your IT – at the enterprise level, the project level, and the system level.

    • Easily see where your processes overlap and slow each other down, based on consistent as-is tabular and graphical data
    • Gain grip with stronger planning capabilities
    • Drive things forward with clarity, velocity, and direction
    • Replace verbose talking and shiny bubbles with clear facts, a common target, and the ability to deliver

    High quality. Fully independent. Absolutely loyal.

    Fixed price option.

  • Executive Summary: Data Strategy 2.0

    #architecture #clarity #velocity #direction 

    In my last post, Executive Summary: Strategic Data Science, I summarized what Data Science is and what it consists of. Moreover, you need to deploy a strategy that helps you manage the transformation to a data-driven business.

    Today, you will see that a strategy for data science can be handled just like any data strategy. And if you already have a data strategy deployed, e.g. as part of your governance or architecture initiative, then you will see why and where it is affected.

    As written in Executive Summary on EA Maturity, having a map and knowing where you are and where you want to go helps a lot in finding a way.

    Maturity

    If you are working with maturity models, you typically do this on a yearly basis. For chosen capabilities, you identify current versus target maturity, e.g. ranked from level 1 to 5.

    The first thing you need to understand is that introducing data science for the first time reduces your overall maturity at once. Why is that?

    Maturity is measured in terms of capabilities, and if you take a look at those capabilities you will find that you need to adapt them. There are typically a dozen or so, such as vision, objectives, people, processes, policies, master data management, business intelligence, big data analytics, data quality, data modeling, data asset planning, data integration, and metadata management.
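
    If you like to keep the yearly assessment in something more reproducible than a slide, the current-versus-target comparison can be captured in a few lines of Python. The capability names come from the list above; the scores are made-up examples:

    ```python
    # Current vs. target maturity per capability, each ranked from 1 to 5.
    # The scores below are made-up examples, not a recommendation.
    assessment = {
        "vision":              {"current": 3, "target": 4},
        "people":              {"current": 2, "target": 4},
        "big data analytics":  {"current": 1, "target": 3},
        "data quality":        {"current": 3, "target": 3},
        "metadata management": {"current": 2, "target": 4},
    }

    # Sort by gap so the yearly roadmap starts with the biggest construction sites.
    gaps = sorted(
        ((name, v["target"] - v["current"]) for name, v in assessment.items()),
        key=lambda item: item[1],
        reverse=True,
    )

    for name, gap in gaps:
        print(f"{name}: gap {gap}")
    # -> big data analytics: gap 2, people: gap 2, metadata management: gap 2, ...
    ```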

    I will pick only a few as examples to make things clear. Let’s pick vision, people, and technology.

    Selected Capabilities for Explaining Maturity of Data Strategy

    Vision

    Say you have a vision like: “Providing customer care that is so satisfying that every customer comes back to us with a smile”. That’s a very strong statement, but how about: “Keeping every customer satisfied by solving all problems before they even complain”. Wow, even stronger. It is possible because Data Science allows you to predict what others can’t.

    People

    Probably, you already have a data architect. But the classic data architect focuses on architecture, technology, and governance issues. That is fine, but you also need a data advisor focusing on solutions the business has not seen yet. Someone who tells you to combine customer data with product usage data to increase your sales. And who perhaps even tells you which of your precious data can be turned into completely new data-driven products you can sell.

    Technology

    Probably, you also have an inventory telling you which data sources are used by your applications. Adding Data Science as a rapidly growing discipline to the equation, you may find that you have to revise your technology portfolio. It is growing and changing quickly and therefore needs to be governed to a certain degree (freedom versus standardization).

    The following list shows selected technologies that are most often used in Data Science (ranked from left to right).

    • Programming Languages: SQL, Python, R
    • Relational Databases: MySQL, MS SQL Server, PostgreSQL
    • Big data platforms: Spark, Hive, MongoDB
    • Spreadsheets, BI, Reporting: Excel, Power BI, QlikView

    Moreover, there is a shift in who is actually using these technologies, toward functions such as leadership, finance, sales, and marketing. And increasingly without dedicated enterprise applications, because data analysis is very dynamic and involves a lot of trial and error.

    Conclusion

    From these few capabilities out of a dozen or more, it has become clear that a Data Science strategy easily fits into an overall Data Strategy. There is no need to reinvent the wheel. Instead, adapt your existing or favorite Data Strategy to incorporate Data Science.