ePUB: The native digital format which scientific journals ignore

www.cygnusmind.com

by Cygnusmind.com

ARTICLE

When we talk about the Internet in academic and scientific publishing, we are referring to the transition of publications into the digital world. We think of the “natural” languages traditionally used on the Web, and two basics come to mind: HTML and XML. Paradoxically, these two languages are the least used in scientific journals in Latin America, specifically in those whose digital publishing platform is Open Journal Systems.

ePUB, short for Electronic Publication, is a responsive, resizable, readjustable open-source format for reading text, images and audio. The ePUB3 format was adopted by the International Digital Publishing Forum in 2011. Two years later, in 2013, it was taken up as the standard format by the International Publishers Association.

One of the greatest advantages of the ePUB format is its portability and responsive nature; that is, it can be read and used on any digital platform or device: personal computer, tablet or smartphone. Its adaptability and its established current use help preserve this reading option for the technological platforms of the future. Furthermore, thanks to its portability, an ePUB file is small to download and can be read offline, which makes it very promising for scientific communication.

Behind the ePUB format lies an information structure that can also be found in XML. Indeed, the standard tagging of scientific articles in XML JATS makes it possible to generate the articles in ePUB format automatically. Consequently, ePUB is a natural reading format for scientific content tagged in XML JATS.

Those features make it one of the most useful formats for reading scientific literature.
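To make the "structure of information" behind an ePUB more concrete, here is a minimal sketch in Python of the container itself: a ZIP archive holding a `mimetype` entry, a `META-INF/container.xml` pointer and a package file. The file names under `OEBPS/`, the identifier and the texts are hypothetical placeholders of ours, and the result has not been checked with a validator, so take it as an illustration rather than a production recipe.

```python
import io
import zipfile

# All file names, identifiers and texts below are hypothetical placeholders.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

CONTENT_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0"
         unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Example article</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2019-01-01T00:00:00Z</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="article" href="article.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="article"/>
  </spine>
</package>
"""

NAV_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>Contents</title></head>
  <body>
    <nav epub:type="toc">
      <ol><li><a href="article.xhtml">Example article</a></li></ol>
    </nav>
  </body>
</html>
"""

ARTICLE_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example article</title></head>
  <body><h1>Example article</h1><p>Body text goes here.</p></body>
</html>
"""


def build_epub() -> bytes:
    """Return the bytes of a minimal ePUB-style ZIP container."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry goes first and is stored without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        files = {
            "META-INF/container.xml": CONTAINER_XML,
            "OEBPS/content.opf": CONTENT_OPF,
            "OEBPS/nav.xhtml": NAV_XHTML,
            "OEBPS/article.xhtml": ARTICLE_XHTML,
        }
        for name, text in files.items():
            archive.writestr(name, text, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()
```

Writing the returned bytes to a file ending in `.epub` gives something a reading application can attempt to open. In a real workflow the XHTML files would be generated from the tagged article rather than typed by hand.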

If you are planning to make an ePUB, regardless of its content, ask yourself whether it meets the following requirements:

  • A logical structure and a suitable, professional navigation interface.
  • Consistent styles: there must be a clear differentiation and hierarchy in the typography throughout the text. [Consistent use and a correct hierarchy in the body go a long way towards easy reading.]
  • A suitable, well-structured index.
  • A customizable format that lets the reader set time, rotation, font size and typeface, color and surface changes, and word hyphenation.
  • An appropriate metadata structure: the data must connect with and inform the user, and must allow search engines to find the information: title, license, authors, editors, etc.
  • Images must be sharp. [It is important to offer responsive design: when the reader zooms in, the definition must remain clear.]
  • When you create an ePUB, you have to know how to exploit it: take advantage of its potential, such as interactivity, added video and audio, and hyperlinks to external sources of information; achieve your goal by using all the possibilities that ePUB offers.
  • The possibility of including JavaScript to add interactive elements to the ePUB: with JS you can add animations, forms, and even video games.
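As an illustration of the metadata point in the checklist above, the following sketch reads the Dublin Core fields (title, authors, rights, language) from the package file of an ePUB. The sample package and its values are invented for the example, and `read_metadata` is a hypothetical helper name of ours, not part of any ePUB library.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core namespace
OPF = "http://www.idpf.org/2007/opf"      # ePUB package namespace

# Invented sample values, for illustration only.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Example article</dc:title>
    <dc:creator>First Author</dc:creator>
    <dc:creator>Second Author</dc:creator>
    <dc:rights>CC BY 4.0</dc:rights>
    <dc:language>en</dc:language>
  </metadata>
</package>"""


def read_metadata(opf_text):
    """Collect every Dublin Core element of the package into a dict of lists."""
    root = ET.fromstring(opf_text)
    metadata = root.find(f"{{{OPF}}}metadata")
    result = {}
    for element in metadata:
        if element.tag.startswith(f"{{{DC}}}"):
            key = element.tag.split("}", 1)[1]  # drop the namespace part
            result.setdefault(key, []).append(element.text)
    return result
```

Calling `read_metadata(SAMPLE_OPF)` returns the title, both creators, the rights statement and the language, which is the kind of information a reading system or a search engine can pick up without opening the text itself.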

However, it is important to point out that this format always needs third-party software to be read: iBooks by Apple, Inc., Kobo, Adobe Digital Editions, Nook by Barnes & Noble, or Aldiko on Android systems, among others. The “view” or display of the text will always have to adapt to the reading software. We emphasize that the way an ePUB behaves depends not on the features of the format itself but on the third-party reading software.

As an open-source format, it offers huge potential to the designers of reading software, who will surely continue to innovate and improve features to give users the best experience with ePUB texts.

We can assure you that the ePUB format has great potential [as we have seen in books] for communicating messages powerfully to the scientific community; but for that, publishers must give it a chance.

How to cite this article: Cygnusmind (2019). ePUB: The native digital format which scientific journals ignore. Retrieved from Cygnusmind´s Blog: https://www.cygnusmind.com/blog/en/article/epub-the-native-digital-format-which-scientific-journals-ignore/

The Digital Publication is not only a PDF

by Cygnusmind.com

ARTICLE

The evaluation of scientific journals by peer review has existed since the 17th century, when The Royal Society began publishing the Philosophical Transactions (today Philosophical Transactions A and B). This journal is still being published and is the oldest of its kind. Furthermore, it is one of the most prestigious and influential. As present-day examples in scientific journal publishing, we can point to PLOS and eLife, among others.

Turning to the digital publication of scientific journals, let’s analyze the Ibero-American context: the move of scientific journals online has grown over time. According to Latindex data, there were 1,735 online journals in 2006. In 2018 the figure was 8,175, which represents an annual growth of around 500 online journals.

Journals registered in the Latindex Directory amount to 26,000, of which 31% are online. Is that number enough? In our opinion the proportion is small: about one in three. Furthermore, if we examine the state of their sites, we find serious problems such as a lack of updates, which affects the PDF format, among others. They are online, but many of them lack adequate technological resources, and journals rely on a wide variety of technological solutions and techniques. Being registered as online does not mean being up to date.
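The arithmetic behind these figures can be checked in a few lines. This is only a back-of-the-envelope sketch: it assumes the 26,000 directory total and the 8,175 online journals refer to the same moment, which the Latindex figures quoted above do not state explicitly.

```python
# Figures quoted above (Latindex).
online_2006 = 1735
online_2018 = 8175
directory_total = 26000  # assumed to be contemporaneous with the 2018 figure

years = 2018 - 2006
average_new_per_year = (online_2018 - online_2006) / years
online_share = online_2018 / directory_total

print(f"Average new online journals per year: {average_new_per_year:.0f}")
print(f"Share of the directory that is online: {online_share:.1%}")
```

The average comes out slightly above 500 journals a year and the share at roughly 31%, consistent with the "one in three" reading.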

The main argument of this article is that online journals, or rather digital publishing, have not been as successful as they were supposed to be. The main reason is that publishers have tried to carry the printed publication (copying and replicating the document) into the digital world. PDF files have served this purpose for many years, but their advantages are gradually being outweighed by their disadvantages. Our proposal is that a digital publication can only reach its full potential when it is separated from the atavisms of the print production process.

Our experience over years of working with different publishers is that a digital journal can only succeed if its PDFs are kept to their own domain, so that the journal can think about the enormous possibilities the digital world currently offers.

Let’s list some of the main ideas that distinguish a digital journal from one that is not.

The limits of PDF are clear:

It does not have the necessary hyperlinks

Multimedia files cannot be fully embedded in it

It does not allow moving dynamically through the sections of the article

It does not allow an immediate global view of the work

Most of the time, it is not possible to get the whole text in a transparent way

The preservation of works is not guaranteed

It is not open source

It is not a responsive format [one that adapts to different screen layouts and devices]

GIFs cannot be placed in it

A digital journal is characterized by:

Not having printed pages

Not having pagination

Not having a fixed publication frequency, which, paradoxically, runs counter to the idea of being a periodical

Not being bound by the limit of 20 to 25 printed pages, which can and should be exceeded

Not having a specific limit on incorporating and integrating the different elements of academic discourse, such as data, formulas, photos, etc.

Differentiating itself from a PDF

Not demanding a linear reading

Non-static text and images

Not prioritizing a preview and a structured design in the speed of message transmission

Having a global digital vision, not a local digital vision

Prioritizing subscriptions, not only links and download links

Not engaging in barter trade

Not having sections

Not defining its importance by the number of subscriptions

PDF:

Can be downloaded and offers offline reading.

In the digital context, portability can be achieved in two ways:

1. By downloading the ePUB file to your device. It is low-cost, adapts to any screen and can be shared with anyone.

2. By saving the HTML page to an offline reading system such as Pocket, Evernote, etc. It can be read without an Internet connection, and the size and typeface of the text can be adjusted.

PDF:

The design is professionally customized and it can be printed.

Printing is a resource that is used less and less today. In cases where it is needed, there is the so-called printer-friendly PDF format. Fixed design is a bad habit inherited from Gutenberg’s time; it makes it quite difficult to take full advantage of the reading.

PDF:

The page is highly important for citing correctly, because that is what we want in scientific research.

In the digital world, citing page numbers is totally unnecessary. Let’s mention some of the reasons:

1. Pagination does not exist, because it changes with every file format.

2. Pagination made sense in Gutenberg’s time, when it was hard to find one’s bearings, identify the context and look beyond the document. Nowadays, with such powerful search engines, we can locate a quotation just by typing three words between quotation marks, or with a + sign. Furthermore, some styles and formats such as APA allow citing without giving page numbers.

PDF:

It can contain comments, notes and underlining.

In any reading system we can underline or add comments. In some systems, the underlining is saved in a “private library”. In others, underlined or selected text can be shared on social networks almost instantly.

PDF:

The socio-scientific networks [Mendeley, ResearchGate, Academia] use it because it is easy to transmit.

We should not confuse customs and practices with real advantages of transmission. In all areas of academic publishing, information must be moved into digital publishing.

PDF:

Equations and mathematical formulas are displayed with fidelity.

Fidelity versus replicability? In the digital context, the fidelity of a formula refers to the capacity to reproduce it, to establish that the data and the conclusions match. It is not enough to see the formula; we must be able to replicate the process and to attach new databases.

The farewell to PDF

In that sense, digital reading systems offer well-known advantages besides the usual speed, low cost and relative ease of reading:

Full-text search

The enlarging and handling of images, videos, graphics, tables, etc.

The immediate linking (the articulation) of references, mentions, etc. In a word, immediate access to the digital environment with sound, video, color, images, context, etc.

The identification, comparison and validation of data sources, arguments or facts that are not in the references. We insist that this is not a matter of semantics; it is a radically different conception of meanings and senses.

Authors are identifiable and can be followed through their ORCID, their links in Redalyc or their websites.

The advantages of a real migration to the digital journal are many. We want to list two of them, whose details we will explain in future articles. They are not detailed here because, in our opinion, they are deep qualitative changes.

We are referring to:

1. Continuous publication.

2. The participation of research actors in the different stages of the publication process.

But if we spoke only about the advantages without the challenges, we would be partial:

A. It is an irreversible decision. Once it is made, there is the risk of not being able to keep up with the latest technology; that is, of starting with great energy but little stamina, which results in a technological lag in the medium term.

B. It is an investment that in the long run is much more efficient and economical, given all the advantages achieved in reducing costs and avoiding wasted time and money. However, we cannot deny that investing in the latest technological advances is inevitable.

C. Who carries out the transition? This is one of the biggest decisions to make. We would not favor any specific actor involved (publisher, institution or consultant). One must have an idea of the cost, the time investment, the resources, and the personnel involved in the processes of managing and producing academic publications. All these aspects must be assessed.

How to cite this article: Cygnusmind (2019). The Digital Publication is not only a PDF. Retrieved from Cygnusmind´s Blog: https://www.cygnusmind.com/blog/en/article/the-digital-publication-is-not-only-a-pdf/

XML: The DNA of scientific articles

by Cygnusmind.com

ARTICLE

What does XML mean? Since the emergence of the Internet we have heard about an information language that machines can read easily: the XML format. But what does XML stand for?

The predecessors of XML are the SGML and HTML formats. The first came into existence in the 1980s, when the main problem for electronic publishing was simply how to publish at all. The second emerged in the 1990s as a way to expand the means of presenting information: display, graphic design, etc. A few years later, the XML format appeared as a solution for representing information.

XML stands for Extensible Markup Language. XML is a metalanguage; in other words, it is a language used to define other languages. It was developed by the World Wide Web Consortium (W3C). XML provides a uniform method for describing and exchanging structured data. It describes structure and semantics, not just the format of the information.

Why is the use of XML so important?

〉Content is kept separate from any notion of how the information is presented.

〉It is an international, platform-independent standard.

〉XML is an open format that can be interpreted by any application.

〉XML can be exchanged between systems, which is the purpose for which it was originally conceived.

Since XML is a metalanguage, anyone can create their own language. There are numerous XML languages created for many purposes. Among the main ones we can mention MathML for mathematics and VML for vector images, among others.

For XML to serve as a means of communication, both parties need to know the same language; this is where it becomes necessary to agree on the vocabulary used to compose the information. A vocabulary is the set of words specific to a language or subject. In this context, the DTD [Document Type Definition] and the XSD [XML Schema Definition] describe the vocabulary used in the language: they specify which tags may be used in documents and how.
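The idea of a shared vocabulary can be sketched with a toy example. The code below is not a DTD or XSD validator; it is a deliberately simplified stand-in that checks a document against a small, hypothetical table stating which tags exist and which children each tag may contain.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: each known tag and the child tags it may contain.
VOCABULARY = {
    "record": {"title", "year", "author"},
    "title": set(),
    "year": set(),
    "author": set(),
}


def vocabulary_errors(xml_text):
    """Return a list of messages for tags that break the toy vocabulary."""
    errors = []

    def visit(element):
        if element.tag not in VOCABULARY:
            errors.append(f"unknown tag <{element.tag}>")
            return
        for child in element:
            known = child.tag in VOCABULARY
            if known and child.tag not in VOCABULARY[element.tag]:
                errors.append(f"<{child.tag}> is not allowed inside <{element.tag}>")
            visit(child)

    visit(ET.fromstring(xml_text))
    return errors
```

A document that only uses `record`, `title`, `year` and `author` in the expected places yields an empty list, while an unexpected tag is reported. Real schemas express far richer rules (order, cardinality, attributes, data types), which is why dedicated validators are used in practice.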

What is the difference between XML and JATS?

JATS [Journal Article Tag Suite] is the vocabulary used for scientific journals. This vocabulary defines the set of tags for marking up the data in scientific articles, among other specific documents in any field. JATS is a technical standard of the National Information Standards Organization [NISO], first issued as Z39.96-2012 [NISO Z39.96-2015 (JATS 1.1) is the current standard]. It derives from a standard defined by the NLM [National Library of Medicine] in the USA.

To understand this language, we can think of the catalogue of books in a public library. The cataloguing tool identifies and separates the data. At the same time, the information can be organized and distributed by year, theme, editor, author, etc. Every possibility for organizing the database depends on tags. For example, a researcher’s work is identified by its title, <title>The Digital Native Journal</title>; the year, <year>2018</year>; the format, <format>HTML</format>; the author, <author>cygnusmind</author>; the URL, <url>https://www.cygnusmind.com/blog/</url>; the summary, <summary>……….</summary>; keywords, etc.
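To see why everything "depends on tags", consider this small sketch. It wraps a few records in hypothetical `<catalogue>` and `<record>` elements of our own (only the inner tag names come from the example above; the second and third records are invented) and groups the titles by any tag we choose.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The first record mirrors the example in the text; the others are invented.
CATALOGUE = """<catalogue>
  <record>
    <title>The Digital Native Journal</title>
    <year>2018</year><format>HTML</format><author>cygnusmind</author>
  </record>
  <record>
    <title>Second example</title>
    <year>2019</year><format>ePUB</format><author>cygnusmind</author>
  </record>
  <record>
    <title>Third example</title>
    <year>2018</year><format>PDF</format><author>another author</author>
  </record>
</catalogue>"""


def group_by(xml_text, tag):
    """Group record titles by the text content of the given tag."""
    groups = defaultdict(list)
    for record in ET.fromstring(xml_text).iter("record"):
        groups[record.findtext(tag)].append(record.findtext("title"))
    return dict(groups)
```

`group_by(CATALOGUE, "year")` organizes the same records by year and `group_by(CATALOGUE, "author")` by author, with no change to the data itself: the tags do the work.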

However, a library catalogue only identifies what we know as the FRONT: the part where we find general information such as the author and institutional affiliation, including the summary and keywords.

BODY refers to almost the entire text: the tagging of sections such as introduction, methods and results, as well as the sequence of paragraphs, the handling of tables, the execution of formulas, the position of images within the text, etc.

BACK, or the end of the article, is where the tagging concludes, with sections such as the reference list or bibliography. These sections are highly important for generating links within the text, for identifying citations and for producing today’s well-known and valued bibliometric indicators such as the Journal Impact Factor (JIF), SCImago Journal Rank (SJR), CiteScore (CS), h-index, etc. It is important to point out that dozens of bibliometric indicators are derived from references and from multiple ways of grouping data (author, country, institution, etc.), but such results or end products can only be generated if the information is identified.
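The three parts can be seen in a heavily trimmed, JATS-style sample. The element names used here (`front`, `body`, `back`, `article-meta`, `contrib`, `kwd`, `sec`, `ref-list`, `ref`) belong to the JATS vocabulary, but the sample omits much that a real article carries (journal metadata, identifiers, affiliations), and the content and the `summarize` helper are our own invention.

```python
import xml.etree.ElementTree as ET

# Trimmed, invented sample: not a complete or valid JATS article.
JATS_SAMPLE = """<article>
  <front>
    <article-meta>
      <title-group><article-title>Example article</article-title></title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name><surname>Doe</surname><given-names>Jane</given-names></name>
        </contrib>
      </contrib-group>
      <kwd-group><kwd>XML</kwd><kwd>JATS</kwd></kwd-group>
    </article-meta>
  </front>
  <body>
    <sec><title>Introduction</title><p>Text of the introduction.</p></sec>
    <sec><title>Methods</title><p>Text of the methods.</p></sec>
  </body>
  <back>
    <ref-list>
      <ref id="r1"><mixed-citation>First reference.</mixed-citation></ref>
      <ref id="r2"><mixed-citation>Second reference.</mixed-citation></ref>
    </ref-list>
  </back>
</article>"""


def summarize(xml_text):
    """Pull a few facts out of each of the three parts of the article."""
    article = ET.fromstring(xml_text)
    front = article.find("front")
    body = article.find("body")
    back = article.find("back")
    authors = []
    for contrib in front.iter("contrib"):
        given = contrib.findtext("name/given-names")
        surname = contrib.findtext("name/surname")
        authors.append(f"{given} {surname}")
    return {
        "title": front.findtext(".//article-title"),
        "authors": authors,
        "keywords": [kwd.text for kwd in front.iter("kwd")],
        "sections": [sec.findtext("title") for sec in body.findall("sec")],
        "reference_count": len(back.findall("ref-list/ref")),
    }
```

The FRONT yields who wrote what, the BODY yields the section structure, and the BACK yields the references from which citation counts, and ultimately indicators, can be derived.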

To understand the complexity and extent of the JATS tagging vocabulary, it is worth mentioning that more than 200 metadata elements exist, although many of them are not used consistently. Even so, the more semantic information is tagged, the more interoperability is enhanced.

Besides, XML markup helps in several respects:

  1. Data retrieval: how to stand out amid the mass of information on the Web, which grows exponentially from day to day. Tagging and markup give search engines, with their “robots” or “spiders”, greater possibilities of finding information immediately and efficiently. Metadata serves this purpose: for example, through metadata, search engines can identify the conditions of use of a piece of content by recognizing Creative Commons licenses.
  2. Digital identity: in the digital world, identity is a somewhat worn-out and controversial notion. It is the path and the trace left on the Web; it is a consequence of how content interrelates. In the scientific and academic context it is, most of the time, the visibility and positioning gained, the identity carried by metadata. Obtaining the right metadata and defining a framework for success is necessary and essential to stand out and present the best scientific, academic, technical and cultural production. In short, it is what ties an identity [e.g. of a journal, author or institution] to the Web.
  3. Digital preservation: the preservation of printed documents has been a real problem around the world. Digital preservation, for its part, implies the ability to access stored data, in addition to backing it up. Because XML is a standardized format, it can be decoded over time and used with tools or software created in the future.
  4. Indexing: current academic production is under pressure to keep up with indexes and to demonstrate quality and compliance with international standards. This forces a greater or lesser degree of interaction with the indexes, which involves exchanging information in XML files, ranging from files that tag only the FRONT or BACK, so that references can be used in citation analysis processes, up to the full text tagged in XML.
  5. Visualization and ubiquity: when we say that XML is the DNA of the scientific article, we mean that it is the code behind the scientific information, which by itself supports multiple processes, including the generation of reading formats. Thus PDF, HTML and ePUB, among others, can be produced immediately.
  6. Interoperability: XML is the format par excellence for scientific exchange, making it possible to substantially increase the visibility and reach of content published on the Web.
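The Creative Commons example in the first point can be sketched as follows. In JATS, licensing information lives under `<permissions>`, where a `<license>` element can carry the license URL in an `xlink:href` attribute. The sample document and the `creative_commons_license` helper below are invented for illustration, and the URL test is a simplification of what a real harvester would do.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

# Invented sample; only the permissions block of the FRONT is shown.
JATS_PERMISSIONS = """<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <permissions>
        <license license-type="open-access"
                 xlink:href="https://creativecommons.org/licenses/by/4.0/">
          <license-p>Distributed under a Creative Commons Attribution license.</license-p>
        </license>
      </permissions>
    </article-meta>
  </front>
</article>"""


def creative_commons_license(xml_text):
    """Return the URL of the first Creative Commons license found, or None."""
    root = ET.fromstring(xml_text)
    for license_element in root.iter("license"):
        href = license_element.get(f"{{{XLINK}}}href", "")
        if "creativecommons.org/licenses/" in href:
            return href
    return None
```

A machine reading this file does not need to interpret the prose of the license statement: the tagged URL is enough to know under which conditions the content may be reused.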

For these reasons, we have no hesitation in saying that XML is the DNA on which the work of The Digital Native Journal rests.

How to cite this article: Cygnusmind (2019). XML: The DNA of scientific articles. Retrieved from Cygnusmind´s Blog: https://www.cygnusmind.com/blog/en/article/xml-the-dna-of-scientific-articles/

The Native Digital Journal

by Cygnusmind.com

ARTICLE

The first scientific journal to take peer review into account was born in 1665: the Philosophical Transactions of the Royal Society. Today it has more than 350 years of uninterrupted publishing. Such vision is to be expected from a body that defines itself as “The independent scientific academy of the United Kingdom and the Commonwealth.”

Today we can confirm that this is a truly digital journal in the context of electronic journals. The closer you get to it, the better your experience of sharing articles through a complex communication method. However, we can hardly say what a “digital” journal is in the full sense of the term. Furthermore, we can think of a communication system in which The Native Digital Journal encompasses the whole publishing process; unfortunately, this concept has emerged late.

Since we are talking about the structure and the ways of presenting the content of scientific journals (not literary, culture or lifestyle ones), let’s speak clearly in scientific terms. Our hypothesis: Native Digital Journals are limited in number and, paradoxically, journals do not know how to take advantage of digital technology to communicate their messages better. As a conclusion, we are unable to find Native Digital Journals. What we find are electronic journals, most of them poor copies of their printed versions. In other words, the way they are configured remains limited by the paradigms of paper and printing; they are still published only in PDF and lack the advantages of digital technology.

Let us discuss this further.

A journal is natively digital when its concept, design and conception belong purely to the digital realm from the beginning. At no point in the processes of discourse, opinion, production, publication, distribution, dissemination, collection of works, or the building of a community of authors and readers does the role and logic of print interfere. On top of that, this is the way to enhance positioning and consolidate well-recognized prestige.

The Native Digital Journal develops, manifests itself and continues to be modified in the digital sphere, although it emerged many decades ago. “The World Wide Web is not just its medium of diffusion, but its backbone that holds the micro-system of scientific communication which represents every publishing” (cygnusmind.com).

What’s more, hypertext is the organization of information (text, data, sounds, images, etc.) through articulations and connections, colloquially called “links”. This allows information to branch out, as explained below:

  1. It allows deep reading, follow-up and consultation that are not necessarily sequential. You can start reading wherever you want; you can approach scientific texts from the method, the discussion, the conclusions or even the sources of information.
  2. The participation of the individual (reader, viewer, reviewer) determines the plan, the direction, the time and the manner in which links are used; the individual decides when and how at every moment, taking an active role in exploring the information.
  3. The text, under peer review, is the substantial element of science. Moreover, in the digital realm, the DNA of scientific discourse is given by the structure and display of information from the digital code: it is inherent in the text.
  4. The digital code favors requirements that science needs:

a. Replicability

b. Construction of new science from existing science

c. Visibility

d. Interoperability

In this sense, mathematical formulas, chemical reactions, equations and other procedural elements cease to be mere mathematical or reading signs. They become executable actions that process data, and repeatable, reusable methodologies. In the digital formula, the symbol calls for action.
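A small sketch shows what "the symbol calls for action" can mean in practice. We take a formula that is not from any particular article, the sample variance, chosen only because it is familiar, and express it as code that any reader can run again on the same or on new data.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


def sample_variance(values):
    """Sample variance: squared deviations from the mean, divided by n - 1."""
    centre = mean(values)
    return sum((value - centre) ** 2 for value in values) / (len(values) - 1)


# Hypothetical measurements, for illustration only.
measurements = [2.0, 4.0, 6.0]
print(mean(measurements), sample_variance(measurements))
```

On a printed page the formula can only be read; here it can be executed, checked against the reported result and reused with another data set, which is the replicability argued for above.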

Tables contain data and actions that can be processed, sorted and graphed. They are active units of information that can be interrogated and questioned. What is the main reason for this? Science knows that conclusions depend on how reality has been questioned (the method), and that their soundness also depends on the characteristics of the data about reality; of course, it is better to have the data at hand.
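As a sketch of a table that can be "interrogated", the following code loads a small table and lets the reader sort it by any column. The column names and values are invented, and the CSV text stands in for whatever structured form the table takes in the tagged article.

```python
import csv
import io

# Invented data, for illustration only.
TABLE_CSV = """sample,temperature_c,yield_percent
A,20,61.5
B,25,70.2
C,30,66.8
"""


def load_rows(text):
    """Read the table and convert the numeric columns to numbers."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "sample": row["sample"],
            "temperature_c": float(row["temperature_c"]),
            "yield_percent": float(row["yield_percent"]),
        })
    return rows


def sort_by(rows, column, descending=False):
    """Return the rows ordered by the chosen column."""
    return sorted(rows, key=lambda row: row[column], reverse=descending)
```

With `sort_by(load_rows(TABLE_CSV), "yield_percent", descending=True)` the reader asks the table which sample performed best, a question the author of a static page may never have anticipated.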

Apart from that, in the digital realm and in the context of visualization, the image ceases to be a marginal or ornamental reference and becomes a source of information in itself, allowing us to enrich the discourse and add previously unthinkable elements to it: a tomography for the detection of cancer, a microscopic element, a set of stars, etc. The image can be represented as coded information, with shape and color in each pixel.

For the first time, movement, interaction and sound can be integrated into a scientific text, not only to improve the reading experience but also to generate more knowledge from enriched elements.

Amid all the existing information (more than a million articles are written each year), researching or locating something on the Web becomes unmanageable and unworkable. This immeasurable volume of information can only be processed by machines, which route it to readers. To do so, the information must be processed, sorted, filtered, grouped and treated, and this depends on the way in which it is structured.

To this end, tagging information in XML under the JATS standard is the DNA of the text; it is the beginning of the semantic web and of the knowledge of wisdom; without it, The Native Digital Journal does not exist.

The Extensible Markup Language, better known as XML, is the language of machines, and JATS (Journal Article Tag Suite) is the language that defines the sets of labels (metadata) into which the information of a scientific article is structured. In this way, we can clearly locate the title, <title>The Native Digital Journal</title>; the year, <year>2018</year>; the format, <format>HTML</format>; the author(s), <author>cygnusmind</author>; the URL, <url>https://www.cygnusmind.com/blog/</url>; the summary, <summary>………</summary>; keywords, etc.

XML-JATS adds value in each phase of the process.

On the other hand, the elaboration of discourse consists in providing semantics and structure beyond the written form and its release. It changes the way in which the expression of a finding is conceived: without limits on pages, attachments, very high-resolution images, visualization formats, etc., all of which an article marked up in XML allows. Let’s explain this below:

  1. Formation: interpretation of the text from the XML file to generate different reading formats such as PDF, ePUB, HTML and others.
  2. Publication: online production of digital formats of scientific articles for enriched reading and appropriate use of the information.
  3. Distribution and dissemination: provides accurate and adequate information to search engines, which facilitates linking and visibility. It also allows the information to be present in thousands of libraries, content aggregators and specialized portals around the world.
  4. Collection of works and community of authors and readers: as expressed by one of the five basic principles of library science, “To each user their information, and to each information its user”. A text with structure and semantics reaches interested readers and authors, extending its reach beyond geographic or language limits to form a solid community that will support the journal.
  5. Communication and interaction: globalization has allowed us to know the social diversity of processes and to recognize that information is produced in multiple languages. Paradoxically, the use and appreciation of HTML has been lost, even though it allows automatic translation (traduttore, traditore) between various languages, with imperfections but constantly improving.
  6. Preservation: a Native Digital Journal with XML-JATS preserves its content for future formats not yet known and guarantees its adaptation to future technology, independent of media, formats and, of course, trademarks. This is what the DNA of the Native Digital Journal makes possible.
  7. Positioning: building the prestige and recognition of a scientific journal. By increasing scope and visibility there is a greater likelihood of obtaining the expected impact. Similarly, as the collection of academic works grows, selectivity and rigorous quality criteria contribute to prestige and recognition within the community.
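The first phase in the list above, producing several reading formats from one XML source, can be sketched in a few lines. The sample below is a trimmed, JATS-style fragment with invented content, and `to_html` and `to_plain_text` are hypothetical helpers of ours; production workflows normally rely on full transformation tools (XSLT stylesheets, for instance) rather than hand-written functions like these.

```python
import html
import xml.etree.ElementTree as ET

# Trimmed, invented sample: not a complete JATS article.
ARTICLE_XML = """<article>
  <front>
    <article-meta>
      <title-group><article-title>Example article</article-title></title-group>
    </article-meta>
  </front>
  <body>
    <sec><title>Introduction</title><p>First paragraph.</p></sec>
    <sec><title>Methods</title><p>Second paragraph.</p></sec>
  </body>
</article>"""


def to_html(xml_text):
    """Render the article as a simple HTML fragment."""
    article = ET.fromstring(xml_text)
    parts = [f"<h1>{html.escape(article.findtext('.//article-title'))}</h1>"]
    for sec in article.iter("sec"):
        parts.append(f"<h2>{html.escape(sec.findtext('title'))}</h2>")
        for paragraph in sec.findall("p"):
            parts.append(f"<p>{html.escape(paragraph.text or '')}</p>")
    return "\n".join(parts)


def to_plain_text(xml_text):
    """Render the same article as plain text, from the same source."""
    article = ET.fromstring(xml_text)
    lines = [article.findtext(".//article-title").upper(), ""]
    for sec in article.iter("sec"):
        lines.append(sec.findtext("title"))
        for paragraph in sec.findall("p"):
            lines.append(paragraph.text or "")
        lines.append("")
    return "\n".join(lines)
```

Both outputs come from the same tagged file; neither is the "original". Adding a further format means adding another rendering function, not re-editing the article.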

Thus far, this is a profound evolution in the transmission of knowledge. When content is freed from the paper medium, various aspects of the scientific publishing and editing process can be rethought much more freely: pagination does not exist in the digital journal, nor is it necessary to cite the page. There is no limit on the length of an article, no typographical requirement, etc., and the different characteristics of print must be reconfigured for the digital environment: typography, spacing, alignment and all typographic aspects must be designed for ease of reading on electronic devices.

The “continuous” Native Digital Journal: there is no reason whatsoever to delay putting an article online once it is finished, that is, once peer review and copyediting are complete. When this is not done, it is because of an atavism of the printed version and of classification systems whose practices do not fit the digital realm or The Native Digital Journal. The classification system (volume, issue, year, page and a set number of works) is an atavism typical of the parameters of print publication. Back then, an issue had to be assembled from a set of items and the design had to be sent to a designer, a protocol that is now archaic. Today that process is unnecessary. The design is done through programming, and the output in HTML, ePUB, PDF, etc. is immediate and carries the editor’s personality under one concept: it moves and adjusts to the document. In this regard, presenting issues in advance could be considered an inappropriate editorial practice because of the confusion involved: referring in the present to something of the future that already exists. All this can be resolved by becoming a “continuous” Native Digital Journal, because in the digital realm putting content online is an editorial decision, not one subordinated to a bygone technical process (layout).

Today the world of scientific communication faces a complex situation regarding roles and processes. The editor has been and remains the guarantor of the quality and integrity of the journal’s content, supported by the anonymous reviewers. But the communication of information and the Native Digital Journal demand advanced technology, knowledge and continuous renewal. The editor needs allies to help the journal take the step; it is not just a matter of technology and programming, it is a conceptual issue carried into the technological realm. It can be resolved by those who have experience in the publishing world and also know how to make the most of the enormous advantages of the technological world.

How to cite this article: Cygnusmind (2019). The Native Digital Journal. Retrieved from Cygnusmind’s blog: https://www.cygnusmind.com/blog/xml/the-digital-native-journal/