Reusable data is the new oil

Abstract. "Data is the new oil" is 21st-century jargon. The movement toward reusable data that it reflects has not been widely echoed in Indonesia. Although some initiatives should be recognized and appreciated, the availability of reusable data in most countries, and especially in Indonesia, is still low. Most of the data published in Indonesian open access journals are embedded in PDF files that cannot be readily reused. We advise editors of Indonesian scientific journals to consider adopting FAIR data sharing by encouraging authors to share their data as supplementary files in a machine-readable format, e.g. csv or xls. This effort will also contribute to the principles of transparency and sustainable development in Indonesia's research ecosystem.


Introduction
Data is the new oil [1]. Data is the main component of any research project, yet it may not be a significant component of the published paper [2][3][4]. Most data are shared inside non-reusable PDF files, embedded in the descriptive text. Such data cannot be directly reused for further study: other researchers have to retype the values before they can analyze or build on them. This article explores the importance of reusable data, describes how to share it, and evaluates the availability of reusable data in scientific papers published in Indonesian and international earth science journals.

Findable
Data needs to be discoverable by search engines, either general web search engines or specialized ones that index scientific material [5]. Examples of general search engines are Google and Bing, while examples of specialized search engines are Google Scholar, Microsoft Academic, and Lens. The main requirement for a document to be found online is complete metadata that is openly available.
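As an illustration of what complete metadata could look like, the following minimal Python sketch builds a dataset description and checks that a few descriptive fields are filled in. The field names, the list of required fields, and all values are assumptions made for this example; they do not follow any specific repository's schema.

```python
import json

# Hypothetical dataset description; every value below is a placeholder.
metadata = {
    "title": "Example groundwater level observations",
    "creators": ["Example Author"],
    "description": "Placeholder description of what the table contains.",
    "keywords": ["geoscience", "groundwater", "open data"],
    "license": "CC0-1.0",
    "file_format": "text/csv",
}

# Assumed minimum set of fields for this sketch, not an official standard.
REQUIRED_FIELDS = ("title", "creators", "description", "license")


def missing_fields(record):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Serialize to JSON so the record can be uploaded alongside the data file.
metadata_json = json.dumps(metadata, indent=2, sort_keys=True)
print(metadata_json)
print("missing:", missing_fields(metadata))
```

In practice, a repository's deposit form or API would dictate the actual fields; the point is only that the description is structured and published openly with the data.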

Accessible
Data needs to be downloadable and openable without unnecessary authorization. Too often, files are shared through Google Drive links that are restricted or password protected. Protection is indeed required for certain types of data, for example sensitive data or data that are not covered by a proper data usage agreement [5].

Interoperable
Data needs to be interoperable, that is, usable by various applications on various platforms; for example, the same file should open in applications running on Linux, Mac, and Windows operating systems. The file format should therefore be generic and readable by many applications [5]. Tables are a common example: they are generally shared in xls or xlsx format. Although Microsoft Excel has become a de facto standard, keep in mind that application-specific formats such as xls or xlsx can only be opened with particular applications, and the result may also depend on the application's version. The recommended data format is plain text, such as txt or csv, which can be opened by a wide range of applications.
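The following minimal sketch shows why plain-text csv is easy to exchange: a small table is written and read back using only Python's standard library, with no spreadsheet application involved. The column names and values are made up for illustration.

```python
import csv
import io

# Illustrative rows only; station names and values are made up.
rows = [
    {"station": "A", "depth_m": "12.5", "lithology": "sand"},
    {"station": "B", "depth_m": "30.0", "lithology": "clay"},
]


def to_csv_text(records):
    """Serialize a list of dictionaries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def from_csv_text(text):
    """Parse CSV text back into a list of dictionaries (values stay strings)."""
    return list(csv.DictReader(io.StringIO(text)))


csv_text = to_csv_text(rows)
print(csv_text)

# Reading the text back recovers the same table.
restored = from_csv_text(csv_text)
print(restored == rows)
```

The same text could equally be opened in a text editor, a spreadsheet program, or a statistics package on any operating system.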

Reusable
Data needs to be shared in a way that makes it easy for others to reuse [5]. The effort required of the reuser should be minimal. In geoscience, for example, maps shared only in raster format (e.g. jpg or png) are not directly reusable by readers, who must first spend additional effort replotting or digitizing them.
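One possible alternative to a raster-only map is to share the underlying locations in a vector, text-based format. The sketch below writes sample points as a GeoJSON FeatureCollection using Python's standard library; GeoJSON is our choice for this example rather than a format prescribed here, and the sample names, coordinates, and rock types are made up.

```python
import json

# Made-up sample points (longitude, latitude), used only for illustration.
samples = [
    {"name": "Outcrop 1", "lon": 107.60, "lat": -6.90, "rock": "andesite"},
    {"name": "Outcrop 2", "lon": 107.65, "lat": -6.95, "rock": "tuff"},
]


def to_feature_collection(points):
    """Convert point records to a GeoJSON FeatureCollection dictionary."""
    features = []
    for point in points:
        features.append({
            "type": "Feature",
            # GeoJSON stores coordinates in longitude, latitude order.
            "geometry": {
                "type": "Point",
                "coordinates": [point["lon"], point["lat"]],
            },
            "properties": {"name": point["name"], "rock": point["rock"]},
        })
    return {"type": "FeatureCollection", "features": features}


collection = to_feature_collection(samples)
geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

A reader can load such a file into GIS software and replot or reanalyze the points without digitizing an image.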

Materials and Methods
We explored the status of data availability in three high-profile national journals in the field of geosciences listed in the SINTA database. Steps: 1) visit the site sinta2.ristekbrin.go.id, 2) open the "Journals" menu, 3) search with the keyword "geo". This search yielded three journals: the Indonesian Journal on Geosciences (IJOG), Geoplanning: Journal of Geomatics and Planning (GGP), and the Indonesian Journal of Geography (IJG).
We compared the results with three international journals in the field of earth sciences selected from the Scimago Journal Ranking list. Steps: 1) visit scimagojr.com, 2) set the subject area filter to "earth and planetary sciences" and the subject category filter to "earth and planetary sciences (miscellaneous)". The three journals are: 1) Nature Geoscience (NG), 2) Annual Review of Earth and Planetary Sciences (AREPS), 3) Earth System Science Data (ESSD).
The observations are stored as supplementary files in csv and xls formats (Table 1). Please refer to the Data Availability section.

Results
The three national journals are high-profile earth science journals in the Sinta 1 category. All three are indexed in the DOAJ database, but only IJOG is indexed in Scopus. All three are open access journals published both electronically and in print, and all three are published by non-profit institutions, namely universities and state institutions. Only one journal, IJG, charges an article processing charge (APC), of USD 300; the other two charge no APC at all. None of the three national journals states a data sharing policy in its author guidelines. In a random review of several publications, we found no articles with supplementary data. All data are embedded in the PDF together with the text.
We compared these national journals with the international journals NG, AREPS, and ESSD. All are high-profile journals ranked in Quartile 1 of the Scimago database. Among the three, only NG is managed by a commercial company. It publishes articles in hybrid mode, offering both non-OA and OA routes to authors. The APC for the OA route is more than USD 11,000. AREPS and ESSD are association journals. ESSD is an OA journal with no APC, published by the European Geosciences Union (EGU) in collaboration with the publisher Copernicus. AREPS, by contrast, is published on a non-OA basis: readers access the articles by paying a subscription fee.
European publishers have been more extensively exposed to data sharing, especially through the large-scale open science movement in Europe, e.g. Plan S and Horizon 2020.

The importance of data sharing
Data sharing is essential because acquiring data costs money, and in the earth sciences it can be very costly; for example, drilling a single well can cost millions of dollars. Such an expense is better justified when the resulting data serve more than one purpose. Drilling data from oil and gas companies, for example, serve as an exploration tool, but the same data can also help local governments understand the subsurface conditions of their area, data that local governments could not afford to acquire themselves.
Another benefit is that the data can be managed properly. Open science guidelines strongly recommend preparing a research data management plan for each research activity. This step makes data management more deliberate: it sets out which parts of the data can be shared, with whom, and which parts cannot be shared.
The next benefit is that the data are stored better and can be searched by the public [8,9]. On today's connected internet, online search engines are the most effective and efficient retrieval tools. Once data have been uploaded online, search engines can find them.
A further fundamental benefit is that sharing research data allows readers to verify the work. Readers can detect errors made by the researcher if the data are shared according to the FAIR principles. Furthermore, later researchers can carry out additional analysis or re-analysis of the data. They can even combine old data with new data to form a more complete time series, which allows more in-depth analysis. In the end, science will develop faster [10][11][12].
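As a minimal sketch of how shared data can be combined with new observations, the Python example below merges two small csv tables into one time series. The column names, dates, and values are made up, and the rule that a newer record replaces an older one for the same date is an assumption of this example.

```python
import csv
import io

# Made-up observations standing in for a previously shared dataset
# and a newly collected one; the second date overlaps on purpose.
OLD_CSV = """date,level_m
2019-01-01,10.2
2019-02-01,10.0
"""
NEW_CSV = """date,level_m
2019-02-01,10.1
2019-03-01,9.8
"""


def read_series(text):
    """Read CSV text into a {date: value} dictionary."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["date"]: float(row["level_m"]) for row in reader}


def merge_series(old, new):
    """Combine two series; the newer value wins when dates overlap."""
    merged = dict(old)
    merged.update(new)
    # ISO 8601 date strings sort chronologically as plain text.
    return sorted(merged.items())


combined = merge_series(read_series(OLD_CSV), read_series(NEW_CSV))
for date, level in combined:
    print(date, level)
```

This kind of merge is only practical when both datasets are available in a machine-readable form; values printed in a PDF table would first have to be retyped.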
Based on our observations, awareness of data sharing policies among Indonesian national journals is low. More education is needed to promote good data sharing governance.

Earth science is people's science
Although geoscience is often perceived as a difficult science because it deals with objects deep underground, it is one of the sciences most closely related to the lives of many people. This perception will not change as long as scientists communicate their knowledge only with each other. Earth science belongs to the community; therefore, it must be communicated in a way that is as accessible to the public as possible [13].
Data is one of the connecting points between this science and other sciences understood by the public. When the FAIR principles are applied, the public can obtain geological data as easily as they can get information on the price of essential commodities. With the advancement of the connected digital world, this is straightforward to do. Scientists can upload their data to places indexed by search engines, such as campus or institutional repositories, the National Scientific Repository managed by LIPI (https://rin.lipi.go.id), or open public repositories such as Zenodo (https://zenodo.org). Academic social networks such as ResearchGate (https://researchgate.net) may also be used, subject to the terms and conditions that follow from their status as commercial products.

Data as public goods
Data is a public good, particularly if the research is publicly funded. Its public nature is limited to data that is not sensitive. Privately funded research data may be excluded, although providing public data can improve the reputation of the private sector [14]. Data not covered by a data usage agreement is also prone to conflict when made public [13,15].
Public data fits well with the goals of open science, which aims to make data easy to find for the purposes of transparency and accountability. This principle is highly relevant to recent conflicts that stem from a lack of data disclosure [14]. Reluctance to disclose data is generally due to reasons that are no longer relevant, such as fear of data being stolen and misused or fear of research ideas being taken by other researchers. Developing a research data management plan (RDMP) that identifies which data are sensitive can overcome those fears [15]. Each field of science may have different characteristics concerning data sharing, data release, and data usage agreements.
Some journals currently require authors to deposit the raw dataset in a repository and share it under a public domain waiver (CC0). One example is the journal F1000Research (F1000R).

Conclusions
The discussion above supports the argument that FAIR data is the new oil. Data can be fuel for future research. Just like oil, reusable data could spark more ideas that bring wider benefits. Fear of being scooped must not restrain the willingness to share data. Data sharing should be the default mode of research dissemination, and more education is needed on how to share sensitive data. Sharing will encourage more multidisciplinary approaches to the same problem.