Linking source code to software applications (entrypoints and service endpoints aka SaaS) with regard for interface type · Issue #271 · codemeta/codemeta · GitHub
Linking source code to software applications (entrypoints and service endpoints aka SaaS) with regard for interface type
The aim of this proposal is to:
* explicitly specify the interface type(s) provided by software
* make an explicit distinction and explicit link between software and software as a service
* allow linking to software instances (services) from the source code metadata
* relate SoftwareSourceCode and SoftwareApplication in both directions
* make 'entry points' and 'service endpoints' explicit
* settle some ambiguous terms
Linking source code to application entrypoint and service endpoints
In #198, #229 and #246 it was discussed and subsequently decided to add hasSourceCode to schema.org and codemeta; a good idea. I propose we
also add a property that is the exact and unambiguous reverse of this. I suggest providesApplication = @reverse hasSourceCode. There is also targetProduct (#267), which has the same
domain and range, but there seems to be a lot of confusion about what targetProduct means exactly. Schema.org defines it as: "Target Operating System / Product to which the code applies. If applies to several versions, just the product name can be used."
That definition is too vague, and there is conflicting information in #267, #246 and #198.
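To illustrate what "exact reverse" means here, the following is a minimal sketch in plain Python, not an implementation of any JSON-LD tooling. The record identifiers (`app:frog-cli`, `src:frog`, and so on) are hypothetical and only serve to show that every hasSourceCode link from an application to its source code implies one providesApplication link in the opposite direction.

```python
from collections import defaultdict

# Hypothetical application records, each pointing at its source code
# through the hasSourceCode property discussed above.
applications = [
    {"@id": "app:frog-cli", "@type": "SoftwareApplication", "hasSourceCode": "src:frog"},
    {"@id": "app:frog-web", "@type": "WebApplication", "hasSourceCode": "src:frog"},
    {"@id": "app:other", "@type": "SoftwareApplication", "hasSourceCode": "src:other"},
]


def invert_has_source_code(apps):
    """Derive providesApplication (source -> applications) from hasSourceCode."""
    provides = defaultdict(list)
    for app in apps:
        provides[app["hasSourceCode"]].append(app["@id"])
    return dict(provides)


provides_application = invert_has_source_code(applications)
print(provides_application)
```

Because the two properties carry the same information, a consumer could state either direction and derive the other.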
Various aspects of what I propose here affect schema.org directly but I thought
it better to pass this through the codemeta community first.
The providesApplication property would allow explicitly linking from the
source code metadata to software applications. This makes two things possible:
1. Expressing entry points for software more clearly, as I proposed earlier in #183 ("Entry points (or API endpoints) extension for codemeta?") back in 2018. An entry point here is simply defined as an executable provided by the source code, each of which can be considered a schema:SoftwareApplication in its own right.
2. Linking source code to service instances where the application is running (service endpoints), each typically associated with a URL. Here the range would be:
   * schema:WebAPI (as proposed in schemaorg/schemaorg#1423, 'Add "WebAPI" type', and worked out in schemaorg/schemaorg#2635, 'WebAPI update based on #1423') - Emphasis here is on the machine interface. The existing schema:EntryPoint also has a place in what they proposed there. Their proposal also covers linking to formal specifications (like OpenAPI/Swagger).
   * schema:WebApplication - Emphasis here is on the human interface (web UI).
   * schema:WebPage - Emphasis here is on the human interface.

   The domain of schema:hasSourceCode would also need to be extended to include all three.

I think we can use providesApplication to cover both cases, but alternatively we could envision two properties (providesApplication vs providesService?).
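As a rough sketch of how a single providesApplication property could still cover both cases, a consumer might separate entry points from service endpoints purely by type. The helper name `split_provided` and the record layout below are assumptions made for illustration; only the three web types come from the proposal itself.

```python
# Types that, under this proposal, denote a running service instance.
SERVICE_TYPES = {"WebAPI", "WebApplication", "WebPage"}


def split_provided(source_code):
    """Split providesApplication entries into (entry_points, services)."""
    entry_points, services = [], []
    for app in source_code.get("providesApplication", []):
        if app.get("type") in SERVICE_TYPES:
            services.append(app)
        else:
            entry_points.append(app)
    return entry_points, services


# Hypothetical, trimmed-down source code record.
frog = {
    "@type": "SoftwareSourceCode",
    "name": "Frog",
    "providesApplication": [
        {"type": "SoftwareApplication", "name": "frog"},
        {"type": "WebAPI", "endpointUrl": "https://webservices.cls.ru.nl/frog"},
        {"type": "WebApplication", "url": "https://webservices.cls.ru.nl/frog"},
    ],
}

entry_points, services = split_provided(frog)
print(len(entry_points), "entry point(s),", len(services), "service endpoint(s)")
```

If the community preferred two separate properties instead, the same split would simply be made by the metadata author rather than by the consumer.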
Interface type
A software application offers one or more interfaces through which users or
machines can interact with it. I'd like to make this information explicit. When
using providesApplication with schema:WebAPI/WebApplication/WebPage it
is already implied. For the more generic schema:SoftwareApplication it is
not. The specific types schema:MobileApplication, schema:VideoGame and the
aforementioned schema:WebApplication already exist, but other interface
types are not covered yet. We could extend these with:
* CommandLineApplication (command line interfaces)
* DesktopApplication (desktop GUIs)
* TerminalApplication (text UIs; think of vim, mutt and ncurses-based tools)
* SoftwareDaemon (software running as a daemon providing some kind of service over a network or local socket, e.g. ntpd or crond); this would be more generic than WebApplication (or WebAPI)
* SoftwareLibrary (APIs; think of libraries, either in the form of shared objects/dll/dylib or in the form of modules for interpreted languages like Python)
More specific types can be envisioned (relates to #256):
* NotebookApplication (a more specific form of WebApplication) - For Jupyter Notebooks and comparable technologies. Characterised by a mixture of text and code, often used in data science. May or may not be tied to a specific URL where an interactive instance is available (e.g. a link to Binder/Colab).
* SoftwareImage - A software application in some kind of image form (such as an OCI container, e.g. Docker) that typically ships the software with all its immediate dependency context. May or may not be tied to a specific URL where the image is obtained (e.g. Docker Hub). Here the provided interface is relevant for operators (in a DevOps context) seeking to deploy the software in an infrastructure.
* SoftwarePackage - The software in some packaged form (e.g. for a particular Linux distribution, Homebrew, a Python wheel, etc.). The difference between this and SoftwareImage is that this packages only the software and not its dependency context. The dependency context is assumed to be explicitly expressed in the package but is obtained from other packages within the same packaging context (whatever package distribution method that may be).
Alternatively, we could have an interfaceType property like I suggested in #183, but there already seems to be a precedent in schema.org for doing it
with Types, so that might be the best way to follow.
An important point to consider is that a software application, even when implemented
in a single executable, may provide multiple interface types. Assigning multiple
types to one resource is not an obstacle (correct me if I'm wrong), so that should be covered already.
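A small sketch of what multiple types could look like to a consumer, assuming the type value may be either a single string or a list of strings. The records and the helper names `as_type_set` and `offers_interface` are hypothetical; the type names are the ones proposed above.

```python
def as_type_set(app):
    """Return the types of a record as a set, accepting a string or a list."""
    value = app.get("type", [])
    return {value} if isinstance(value, str) else set(value)


def offers_interface(app, interface_type):
    """True if the record declares the given interface type."""
    return interface_type in as_type_set(app)


# Hypothetical records: one executable with two interface types, one with a single type.
editor = {"type": ["TerminalApplication", "CommandLineApplication"], "name": "example-editor"}
daemon = {"type": "SoftwareDaemon", "name": "example-daemon"}

print(offers_interface(editor, "TerminalApplication"))
print(offers_interface(daemon, "CommandLineApplication"))
```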
Executable Name
In order to express entry points explicitly, it's important to list the exact
executable names, which are not necessarily identical to the name.
Alternatively, one may argue that schema:identifier suffices for this.
There is already a schema:executableLibraryName property (used in a
documentation context on APIReference). That could be reused for the
proposed SoftwareLibrary. But a more generic executableName would need
to be introduced for the others, and there's no real reason not to use that for
libraries as well. The executableName would be defined as that which is
runnable (within a certain runtimePlatform context). It should not contain
platform-specific extensions like .exe, .so, .dylib or .dll, but just
the name portion. For software libraries on platforms like Python, it would
correspond to the top-level module name that can be imported.
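The extension rule above can be sketched as a small normalisation helper. This is only an illustration under the stated assumption that exactly one of the four listed extensions is stripped; versioned shared objects such as `libfrog.so.2` are deliberately left untouched, and the function name is hypothetical.

```python
# Platform-specific extensions named in the proposal.
PLATFORM_EXTENSIONS = (".exe", ".so", ".dylib", ".dll")


def executable_name(filename):
    """Strip one trailing platform-specific extension, if present."""
    lowered = filename.lower()
    for ext in PLATFORM_EXTENSIONS:
        if lowered.endswith(ext):
            return filename[: -len(ext)]
    return filename


for candidate in ("frog.exe", "libfrog.so", "libfrog.dylib", "frog"):
    print(candidate, "->", executable_name(candidate))
```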
Such a property may also make sense directly on SoftwareSourceCode,
allowing for a more succinct expression rather than needing to go via providesApplication and the corresponding SoftwareApplication subtypes.
Example
Consider the following example of a SoftwareSourceCode instance where the
codebase provides various interface types. (This software actually exists
though in reality it's not a single codebase that provides all these interfaces,
it's split into multiple repositories, but it would be conceivable someone does
it like this):
Conclusion
I've tried to tie together some existing loose ends in this proposal, reusing
as much of the existing codemeta/schema vocabulary as possible and linking with
other existing proposals, keeping the amount of newly introduced vocabulary to
a minimum.
What this subsequently allows is expressing software metadata from multiple
perspectives. One may start with a codemeta.json and the source code as a
basis and produce a complete tree of software applications and service
instances that are provided by the source code. In a research context, there's
often a single institute bringing a web demo of a certain piece of research
software online. It makes sense to be able to accommodate this
metadata directly from the codemeta.json in the source code root.
Moreover, this enables conversion of entrypoint metadata already present in
e.g. Python setup.py, to codemeta/schema.
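As an illustration of such a conversion, the sketch below maps setuptools-style `console_scripts` specifications (strings of the form `name = module:function`) to the proposed CommandLineApplication records. The function name and the example script names are hypothetical, and the output uses only the properties proposed in this issue; it is not an existing codemeta converter.

```python
def console_scripts_to_applications(console_scripts):
    """Map 'name = module:function' specs to proposed CommandLineApplication records."""
    apps = []
    for spec in console_scripts:
        # The part before '=' is the name of the installed executable.
        name = spec.split("=", 1)[0].strip()
        apps.append({
            "type": "CommandLineApplication",
            "executableName": name,
            "name": name,
        })
    return apps


# Hypothetical entry point declarations as they might appear in setup.py.
scripts = ["frog-cli = frog.cli:main", "frog-service=frog.service:main"]
provided = console_scripts_to_applications(scripts)
print(provided)
```

The resulting list could then be attached to the SoftwareSourceCode record under providesApplication.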
For those who take the other perspective and express metadata as WebAPI or WebPage or WebApplication first and foremost, this provides the means
to explicitly link it to the source code.
Apologies for the long post but I wanted to make sure to sketch a complete
picture, I'd be appreciative of any feedback. Most of this is probably more for
schema.org than codemeta but I wanted to discuss it here first and see what you
suggest. I'd also like to poke @dgarijo in this because I see he's been doing
some excellent work on formalizing things in the Software Description Ontology
and we have some overlap there (this touches upon #229 and #256).
```json
{
    "@type": "SoftwareSourceCode",
    "name": "Frog",
    "codeRepository": "https://github.com/LanguageMachines/frog",
    ...,
    "providesApplication": [
        {
            "type": "CommandLineApplication",
            "executableName": "frog",
            "name": "Frog",
            "runtimePlatform": "Linux"
        },
        {
            "type": "SoftwareLibrary",
            "executableName": "libfrog",
            "name": "Frog Library",
            "runtimePlatform": "Linux"
        },
        {
            "type": "SoftwareLibrary",
            "executableName": "frog",
            "name": "Frog Python Binding",
            "runtimePlatform": "Python"
        },
        {
            "type": "WebAPI",
            "provider": "Radboud Universiteit Nijmegen",
            "endpointUrl": "https://webservices.cls.ru.nl/frog",
            "endpointDescription": "https://webservices.cls.ru.nl/frog",
            "conformsTo": "https://clam.readthedocs.io/en/stable/",
            "documentation": "https://webservices.cls.ru.nl/frog/info",
            "contentType": "application/xml"
        },
        {
            "type": "WebApplication",
            "executableName": "frog-service",
            "provider": "Radboud Universiteit Nijmegen",
            "url": "https://webservices.cls.ru.nl/frog"
        }
    ]
}
```