Modeling Time-to-Acceptance for ISI-Indexed Journals in the Profession of Library and Information Science

There are many factors affecting review duration after a paper has been submitted to a journal. Developing a time-to-acceptance model of each journal for the whole time span from submission to acceptance can help researchers when they are selecting journals to publish research results, as well as help editors when they are optimizing workflow and strategy. Using ISI-indexed journals in the profession of library and information science as an example, this study aims to explore the possible patterns of time-to-acceptance for refereed articles. Based on the theories of maximum likelihood estimation, this article models probability distributions for the retrieved data through the R package fitdistrplus. The Kolmogorov-Smirnov test is further used to determine if the distribution for each journal can be accepted.


INTRODUCTION
In academia, knowledge should be made available and distributed efficiently once it is generated. Scholarly publication is one of the major ways that knowledge and research results are disseminated. Palese, Coletti, and Dante (2012) advocated that the scientific world needs to reflect on publication efficiency and its mechanisms. Both journal editors and authors care about the speed of the review process, because timeliness is an important factor in a journal's reputation and in authors' decisions about where to submit manuscripts (Hodges, Elsner, & Jagger, 2012; Chen, Chen, & Jhanji, 2013).
The process of peer review and publication has changed in the past decades from handwritten manuscripts to electronic versions, significantly reducing the processing time from the first submission of a manuscript to its publication. However, many random human factors still influence the review process, and these factors cannot be easily measured. Thus, instead of quantifying every random factor, the authors seek to develop a mathematical model that covers the overarching period and brings all factors under one umbrella. Using classical statistics, this study explores the possibility of estimating the time-to-acceptance of refereed articles published in ISI-indexed journals in the field of library and information science. Based on the data found on the journals' websites, the paper proposes fitting probability distributions to the data to estimate the time-to-acceptance of refereed articles published in these journals. The results of this article can be used as a reference tool for authors who are working to meet a deadline, as well as for editors who are optimizing workflow. Researchers can also use or expand this paper's methodology to develop estimation models for journals in other disciplines.

Studies of Journal Publication Speed and Impact Factor
Researchers from various disciplines have studied journal publication time and analyzed the correlation between publication speed and impact factor. Some theoretical models suggest that journals that publish more rapidly increase the likelihood that their articles are cited, which contributes to a higher impact factor (Ray, Berkewits, & Davidoff, 2000; de Marchi & Rocchi, 2001; Metcalfe, 2002; Yu, Wang, & Yu, 2005; Yu, Guo, & Li, 2006; Pautasso & Schafer, 2009). In a recent study of journals from seven different disciplines, Lievers (2013) found a negative correlation between journal impact factor and acceptance time. Lievers claimed that this research demonstrates a representative pattern in the broader scientific literature: manuscripts are processed faster both in journals and in journal categories with higher impact factors.
However, some researchers have reached different conclusions. In studying journals in ophthalmology, Chen et al. (2013) reported that the individual median peer review time (from submission to acceptance) ranged from 35.5 to 263 days, with a combined median of 133 days. Using the Spearman test, they did not find any correlation between impact factor and publication time lag. However, they agreed that the publication time lag of a journal is one of the key factors affecting an author's decision in selecting journals for publication.

Publication Deadlines and Tenure Track Expectation
"Winning the tenure game is not about what you do; it's about when you do it," wrote Russell James (2014, p. 39), who explained that a faculty member's dossier is typically submitted in the fifth academic year because tenure evaluation happens in the sixth year of employment. When considering the starting date of employment and the academic calendar, a faculty member usually has four years and seven months to build a strong dossier, in which only published or accepted publications can be included. Getting an article accepted and published in a peer-reviewed journal can be a tedious process with a long wait. The whole process, from the editor's preliminary review through first-round peer review, revision, resubmission, and second-round peer review to the editor's decision, takes a long time, so the real tenure-track clock for faculty is three years instead of six.
It is challenging for tenure-track faculty to select the right journal in which to publish their research articles. This is especially true for junior tenure-track faculty working in academic libraries. James (2014) suggested that, in order to identify a realistic amount of time, tenure applicants should prioritize journals based on how fast the journals complete the review process. The best theoretical strategy is to start with the highest ranked journals because of their shorter response time and quicker publication speed. Taborsky (2007) also mentioned that journals with quicker turnaround times are usually ranked higher. Taborsky then proposed that authors choose a journal based on the average time to publication, mainly for two reasons: impact factors and citation statistics are affected by publication speed, and delays in publication adversely affect the evaluation of the researcher's academic dossier.
In manuscript preparation, an inquiry into individual journals is of significant help as well.
Recognizing the authors' concerns pertaining to the available information of publication efficiency, the database Cabell's Directory of Publishing Opportunities collects and makes available journal information from a variety of disciplines about the review process, time to publication, time to review, and so on. However, time to publication and time to review are not available for all journals. The Directory does not specify the sources of these times or how they are calculated.

Publication Time Concerns from Authors and Journal Editors
A shorter publication lag probably facilitates the distribution of research findings or enhances the impact of research achievements as well as the impact of the journal itself. From these points of view, both authors and editors value timeliness of publications and consider it a quality indicator of a scholarly journal.
In the profession of library and information science, Greifeneder (2013) provided explanations on the peer review process to help researchers understand why the reviewing process requires a certain time span. Moreover, in the journal "Library Hi Tech" Greifeneder published five rules for researchers to follow to expedite the review process. Diospatonyi, Horvai, and Braun (2001) agreed that publication speed is one of the factors that determines a journal's quality. The results from a global survey of 554 authors about the quality and impact of occupational therapy journals showed that timeliness of review and publication ranked in the top four in importance among 11 quality indicators (Rodger, McKenna, & Brown, 2007). In recent research, Adler and Liyanarachchi (2015) collected authors' views on the editorial review processes of 42 accounting journals through a webmail survey. Eight hundred and fifty-six respondents from all over the world expressed their satisfaction with the overall editorial review process in general. Nevertheless, survey results indicated that some journals do not successfully provide prompt editorial feedback.
Additionally, significant differences were witnessed for the timeliness of reviews given a journal's editorial office location, perceived rank, and sub-discipline. The researchers argued that timeliness of review and publication might be considered essential measures of journal quality; they also believed that delays in review and publishing created a negative influence on the work's impact, especially if its implications are time sensitive.
Journal editors have made efforts to streamline the article review process and shorten publication time. The Gynecologic Oncology Group integrated resources among its member institutions, optimized the manuscript development process by prioritizing resources and monitoring compliance with deadlines, and eventually improved the time to journal acceptance by an average of 346 days (Bialy, Blessing, Stehman, Reardon, & Blaser, 2013). Claire Johnson (2005), editor of the Journal of Manipulative and Physiological Therapeutics, indicated in an editorial that she hoped to improve the quality and timeliness of manuscript publication by instituting mechanisms including electronic submission, submission pre-review, publication priority hierarchy, and rapid review.

Statistical Model of Time-to-acceptance Prediction
Looking at publication time from a different perspective, Hodges, Elsner, and Jagger (2012) applied a Bayesian approach with Markov chain Monte Carlo sampling to develop a prediction model for time-to-acceptance in the field of hurricane studies. It is the only predictive model for time-to-acceptance of refereed articles found in the literature at this time. The model helps editors foresee the number of manuscripts ready for publishing; it also helps authors estimate the probability of meeting a deadline, such as a tenure review or a conference research panel with a fixed due date.
In their research, Hodges et al. (2012) collected 133 articles published from January 2008 to December 2010 in ten American Meteorological Society journals with the keyword "hurricane," and defined the time-to-acceptance as "τ," the statistic of interest. Because the gamma density is commonly used to model time periods, they assumed that τ is a random variable with a gamma density, placed a uniform prior distribution on the parameter vector, and derived the posterior density from Bayes' theorem. Given a pair of parameter values, they calculated the posterior density using the programming language R and then used contour functions for the joint posterior of the two parameters. Using the Markov chain Monte Carlo approach, 1000 random samples were drawn and plotted against the contours. The model fit was checked by comparing quantiles from the data with the same statistics from the posterior draws.

PROBLEM STATEMENT
Based on the literature, it is clearly beneficial for both editors and researchers to estimate the time span between submission and acceptance for individual journals. Editors may be interested in estimating how many manuscripts will be ready to publish for each upcoming issue and in maintaining a reasonable inventory, while researchers would like to know in advance how long it will take for their submitted manuscripts to go through the entire review process and be accepted. However, no prediction model for time-to-acceptance has been established other than the one developed by Hodges et al. (2012) for hurricane study articles. Their model was developed on the assumption that the time-to-acceptance of articles retrieved from ten American Meteorological Society journals with the keyword "hurricane" follows a gamma distribution. As stated in their study, the methodology can be adopted with other search criteria: "the less specific the criteria (e.g. "hurricane" or "tropical storm"), the smaller the variance (large sample size) on answers to inferential questions but the larger the bias on those answers relative to specific interests" (Hodges et al., 2012, p. 882). Because the coverage of a refereed journal is usually broader than one specific topic, this model is not ideal for estimating the time-to-acceptance of an individual refereed journal. Therefore, the authors of this article aim to investigate an alternative solution for journals in the library and information science field and seek to answer a set of research questions. With these questions in mind, the authors examined the possibility of estimating the period between manuscript submission and acceptance for an individual journal based on existing data. The estimated overarching period takes all factors that influence review time into consideration.

Methodology and Theory
In this study, time-to-acceptance is defined as days between "submitted" (or "received") and "accepted." Aiming to analyze and estimate the period for manuscript review and revision, the authors created three selection criteria: 1) only research articles, case studies, or literature review articles should be included; 2) accurate duration information (month, day and year) must be available; and 3) time-to-acceptance is at least one day. Since only published or accepted articles can be found, the study does not include rejected manuscripts. That is, time-to-acceptance is based on the fact that the articles have been "accepted" by refereed journals.
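The definition and the three selection criteria above can be sketched in a few lines. The snippet is an illustration in Python, not the authors' own tooling; the record fields (`type`, `received`, `accepted`), the function names, and the sample records are hypothetical and invented for this example.

```python
from datetime import date

# Article types admitted by criterion 1.
ELIGIBLE_TYPES = {"research article", "case study", "literature review"}

def time_to_acceptance(received, accepted):
    """Days between the 'received' (or 'submitted') date and the 'accepted' date."""
    return (accepted - received).days

def is_eligible(record):
    """Apply the three selection criteria to one article record."""
    if record["type"] not in ELIGIBLE_TYPES:                       # criterion 1
        return False
    if record["received"] is None or record["accepted"] is None:   # criterion 2
        return False
    # Criterion 3: time-to-acceptance is at least one day.
    return time_to_acceptance(record["received"], record["accepted"]) >= 1

# Hypothetical records, invented for illustration only.
records = [
    {"type": "research article", "received": date(2014, 1, 10), "accepted": date(2014, 5, 20)},
    {"type": "editorial", "received": date(2014, 2, 1), "accepted": date(2014, 2, 3)},
    {"type": "case study", "received": date(2014, 3, 5), "accepted": date(2014, 3, 5)},
    {"type": "literature review", "received": None, "accepted": date(2014, 6, 1)},
]

durations = [
    time_to_acceptance(r["received"], r["accepted"])
    for r in records
    if is_eligible(r)
]
print(durations)
```

Only the first record survives: the editorial fails criterion 1, the same-day acceptance fails criterion 3, and the record without a full received date fails criterion 2.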
In order to develop a comprehensive understanding of the publication lags between manuscript submission and acceptance, the authors reviewed all 85 journals in the discipline of library and information science from Thomson Reuters' ISI Index list. The authors found that 24 journals contain dates of submission and acceptance in the published articles. The 24 journals and the abbreviations used in this paper are listed in Table 1.

TABLE 1 (excerpt). Journals and Abbreviations
No. | Journal | Abbreviation
 | Interlending & Document Supply* | IDS
11 | International Journal of Information Management* | IJIM
12 | Journal of the Association for Information Science and Technology | JASIST
13 | Journal of Academic Librarianship | JAL

Exploratory Study and Duration Selection
The authors conducted an exploratory study on the journal Library Hi Tech to explore the possibilities of developing an estimation model for time-to-acceptance. As stated in the Author Guidelines of the journal, "Each paper is reviewed by the editor and, if it is judged suitable for this publication, it is then sent to two independent referees for double blind peer review" (Emerald Group Publishing, 2014). Therefore, the overarching span of submission-to-acceptance in this study covers the complete process of editor review, referee review, and revision. Factors in the process that may affect the duration are not discussed in the research, but have been included in this modeling.
Library Hi Tech releases received, revised, and accepted dates on the PDF of each article from 2004 (Volume 22) to 2015 (Volume 33). Out of 46 issues, 38 have full date information while the remaining eight contain only the month and year. Table 2 shows the complete collected data for the journal, including the average, median, maximum, and minimum time-to-acceptance (in days). The authors observed that the annual average and median time-to-acceptance have been comparatively stable within three time periods: 2004-2007, 2008-2011, and 2013-2015. In order to confirm whether the range of data can be used for further research, the Levene test was used to evaluate the homogeneity of the collected data. The test confirmed a homogeneity of
Another criterion for selecting the range of data for the study is the recentness of publications. Academic journal review and publishing have been undergoing changes as a result of technological advancement and process improvement. The earlier data may not be relevant to what has happened recently or is happening currently. Additionally, the time-to-acceptance is influenced by fixed factors such as review procedures, policies, criteria, reviewers, and editors, as well as random factors such as the workload of a reviewer during a specific time period and the authors' revision efforts. Thus, the authors believe that the most recent time-to-acceptance data are more valuable for foreseeing the expected publication time in the near future, because fixed factors are most likely to be consistent and stable. In the literature, Greifeneder (2013) held a similar point of view and used the most recent publishing data instead of older data. Based on the exploratory test on the journal and the selection criteria suggested in the literature, the authors selected time-to-acceptance data from scholarly journals published in the most recent years: 2013, 2014, and 2015.

Journal Selection and Data Collection
Dates of the articles submitted, received, and accepted are usually made available in the PDF or HTML version. To avoid time-consuming manual data collection and reduce the possibility of human error, the authors downloaded citation data in BibTeX format and wrote Perl scripts to retrieve the submitted and accepted dates in batch from the HTML versions of articles. For dates available only in PDFs, the authors collected the data manually.
Some of the 24 journals provided incomplete data for the selected years. For instance, articles of some volumes and issues do not contain date information, or contain only the month and year of publication. To avoid research bias caused by incomplete data and to secure the reliability of the research results, the authors removed five journals from the research: Government Information Quarterly, Interlending & Document Supply, International Journal of Information Management, Journal of Documentation, and Library & Information Science Research.
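Batch retrieval of this kind can be sketched as follows. The authors report using Perl; the Python snippet below is only an analogue, and the date wording it matches ("Received 12 March 2013; ... accepted 4 August 2013") is an assumed format. Real journal pages differ and would need their own patterns; `PATTERN`, `extract_dates`, and the sample markup are hypothetical.

```python
import re
from datetime import datetime

# Assumed wording of the date line; not taken from any specific journal site.
PATTERN = re.compile(
    r"(?:Received|Submitted)\s+(\d{1,2} \w+ \d{4}).*?Accepted\s+(\d{1,2} \w+ \d{4})",
    re.IGNORECASE | re.DOTALL,
)

def extract_dates(html):
    """Return (received, accepted) as dates, or None if full dates are not found."""
    match = PATTERN.search(html)
    if match is None:
        return None
    received, accepted = (
        datetime.strptime(text, "%d %B %Y").date() for text in match.groups()
    )
    return received, accepted

# Hypothetical fragment of an article page.
sample = "<p>Received 12 March 2013; revised 2 June 2013; accepted 4 August 2013</p>"
received, accepted = extract_dates(sample)
print(received, accepted, (accepted - received).days)
```

Because the pattern requires a day, a month, and a year, an entry that gives only the month and year yields `None`, which mirrors the requirement for full date information.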

Theory
Distribution fitting is the matching of a probability distribution to the observed data concerning the repeated measurement of a variable phenomenon. The primary objective of distribution fitting is to forecast the probability or frequency of occurrence of the magnitude of the phenomenon in a certain interval. The principle of distribution fitting is to find the type of distribution and the parameter values that give the highest probability of producing the data. In this study, before fitting distributions to the collected data, the authors selected candidate distributions by inspecting the histogram, then used maximum likelihood estimation (MLE) to estimate the parameters of each candidate and obtained the respective log-likelihood value, Akaike information criterion (AIC), and Bayesian information criterion (BIC) (Lee & Wang, 2013). These values measure the quality of each probability distribution and provide a means of distribution selection.
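As a minimal sketch of maximum likelihood estimation, the lognormal case is convenient because its estimates have a closed form: the mean and the standard deviation (with divisor n) of the logged data. The gamma and Weibull candidates need numerical optimization, which the paper delegates to fitdistrplus and which this sketch does not attempt. The function names and the sample durations are hypothetical.

```python
import math

def fit_lognormal_mle(data):
    """Closed-form MLE of the lognormal parameters (mean and sd of log data)."""
    logs = [math.log(x) for x in data]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)  # divisor n, not n - 1
    return mu, sigma

def lognormal_loglik(data, mu, sigma):
    """Log-likelihood of the data under a lognormal(mu, sigma) distribution."""
    return sum(
        -math.log(x * sigma * math.sqrt(2.0 * math.pi))
        - (math.log(x) - mu) ** 2 / (2.0 * sigma ** 2)
        for x in data
    )

# Hypothetical time-to-acceptance values in days, invented for illustration.
durations = [30, 45, 60, 90, 120, 200, 365]
mu_hat, sigma_hat = fit_lognormal_mle(durations)
best = lognormal_loglik(durations, mu_hat, sigma_hat)
print(f"mu = {mu_hat:.3f}, sigma = {sigma_hat:.3f}, log-likelihood = {best:.3f}")
```

Moving either parameter away from its estimate can only lower the log-likelihood, which is the property the criteria in the following paragraphs build on.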
AIC (Akaike, 1969), a widely accepted criterion, is based on the log-likelihood value and is defined as

$$\mathrm{AIC} = -2\,l(\hat{b}) + 2p$$

where $l(\hat{b})$ is the log-likelihood value, $\hat{b}$ denotes the MLE of all the parameters in the distribution, and $p$ is the number of parameters in the distribution. Given a set of candidate distributions, the preferred distribution is the one with the minimum AIC value.
BIC (Schwarz, 1978), another widely used criterion, is known for penalizing the number of parameters more strongly than AIC. It is based on the log-likelihood, the number of parameters in the distribution ($p$), and the total number of observations ($n$):

$$\mathrm{BIC} = -2\,l(\hat{b}) + p \log n$$

Similar to AIC, among a group of candidate distributions, the one with the minimum BIC value is preferred.
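Both criteria can be computed directly from a fitted log-likelihood. The sketch below assumes the conventional "smaller is better" forms, AIC = -2l + 2p and BIC = -2l + p log n, which match the minimum-value selection rule described here; the function names and the example numbers are hypothetical.

```python
import math

def aic(log_likelihood, p):
    """Akaike information criterion for a fit with p estimated parameters."""
    return -2.0 * log_likelihood + 2.0 * p

def bic(log_likelihood, p, n):
    """Bayesian information criterion for p parameters and n observations."""
    return -2.0 * log_likelihood + p * math.log(n)

# Hypothetical fit: log-likelihood of -100 with two parameters on 50 observations.
example_aic = aic(-100.0, p=2)
example_bic = bic(-100.0, p=2, n=50)
print(example_aic, round(example_bic, 3))
```

BIC exceeds AIC whenever log n is greater than 2, that is, for samples of eight or more observations, which is why it penalizes extra parameters more strongly on data sets of realistic size.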
According to Lee and Wang (2013) and NIST/SEMATECH (2013), the Kolmogorov-Smirnov test can be used to compare the samples from the estimated distribution with the empirical distribution, and determine if the null hypothesis can be accepted.
The Kolmogorov-Smirnov test statistic is defined as

$$D = \max_{1 \le i \le n} \left( F(Y_i) - \frac{i-1}{n},\ \frac{i}{n} - F(Y_i) \right)$$

where $Y_1 \le \dots \le Y_n$ are the ordered observations and $F$ is the theoretical cumulative distribution of the distribution being tested, which must be a continuous distribution and must be fully specified.
In addition to maximum likelihood estimation (MLE), the R package fitdistrplus offers other estimation methods: moment matching estimation (MME), quantile matching estimation (QME), and maximum goodness-of-fit estimation (MGE) using eight different distances. These estimation methods are used to determine a probability distribution modeling the random variable and to estimate the parameters of that distribution (Delignette-Muller & Dutang, 2014).
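A small sketch may help make the statistic concrete. It uses a common form of the definition (the one given in the NIST/SEMATECH handbook): the largest gap between F and the empirical step function, checked just before and just after each ordered observation. The Weibull parameters and the durations below are hypothetical, and the helper names are illustrative only.

```python
import math

def ks_statistic(sample, cdf):
    """Largest gap between a fully specified CDF and the empirical step function."""
    ys = sorted(sample)
    n = len(ys)
    d = 0.0
    for i, y in enumerate(ys, start=1):
        f = cdf(y)
        # Compare F with the empirical CDF just before and just after the jump at y.
        d = max(d, f - (i - 1) / n, i / n - f)
    return d

def weibull_cdf(shape, scale):
    """Return F(t) = 1 - exp(-(t / scale) ** shape) as a function of t."""
    return lambda t: 1.0 - math.exp(-((t / scale) ** shape))

# Hypothetical durations (days) and hypothetical, fully specified parameters.
durations = [35, 60, 82, 101, 133, 150, 190, 240, 263, 310]
d = ks_statistic(durations, weibull_cdf(shape=1.6, scale=170.0))
print(f"D = {d:.4f}")
```

Turning D into a p-value requires the null distribution of D. The paper reports a simulated p-value for this purpose, a step this sketch does not reproduce.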

CALCULATION & DISCUSSION
The authors imported the retrieved time-to-acceptance data from each journal into the statistical software R and used the package fitdistrplus for fitting and graphing distributions. The calculation began with plots of the empirical distribution function and histogram using the plotdist function from the fitdistrplus package. Figure 1 shows both the empirical density (with histogram) and the empirical cumulative distribution function (CDF) plots.

Figure 1. Histogram and CDF plots of an empirical distribution for LHT time-to-acceptance 2013-2015
In addition to empirical plots, skewness and kurtosis were also calculated to facilitate the selection of distributions. The function descdist was used to estimate skewness and kurtosis and results were plotted to a Cullen and Frey graph (See Figure 2). The results of journal LHT demonstrated a positive skewness and a kurtosis of 3.66, which matches the right-skewed empirical distribution in Figure 1. The three common right-skewed distributions, Weibull, gamma, and lognormal distributions were thus taken into consideration in this study. In the R package fitdistrplus, the function fitdist returns parameter estimates, estimated standard errors, log-likelihood, Akaike and Bayesian information criteria (AIC and BIC), and the correlation matrix between parameter estimates. It also provides four classical goodness-of-fit plots: 1) the density plot represents the density function of the fitted distribution along with the histogram of the empirical distribution; 2) the CDF plot displays both the empirical and fitted distribution; 3) the Q-Q plot demonstrates the empirical quantiles against the theoretical quantiles and underlines the lack-of-fit at the distribution tail; and 4) the P-P plot shows the empirical distribution function evaluated at each data point against the fitted distribution function and emphasizes the lack-of-fit at the distribution center.
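The skewness and kurtosis that guide the choice of candidate distributions can be computed directly from the data. The sketch below uses plain moment-based estimates, with kurtosis on the scale where a normal distribution scores 3; descdist may apply a bias correction by default, so its output can differ slightly from these values. The function name and the sample data are hypothetical.

```python
def skewness_kurtosis(data):
    """Moment-based sample skewness and (non-excess) kurtosis."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2

# Hypothetical right-skewed durations (days), invented for illustration.
durations = [20, 35, 40, 55, 60, 75, 90, 130, 210, 400]
skew, kurt = skewness_kurtosis(durations)
print(f"skewness = {skew:.3f}, kurtosis = {kurt:.3f}")
```

A positive skewness, as in this invented sample, points toward right-skewed candidates such as the Weibull, gamma, and lognormal distributions.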
The goodness-of-fit plots (Figure 3) indicate that both the Weibull and gamma distributions fit the data, at least graphically. However, the AIC (1243.631) and BIC (1249.086) of the Weibull fit are higher than those of the gamma fit (AIC = 1242.525, BIC = 1247.98), so the gamma distribution is preferred over the Weibull. The p-value of the Kolmogorov-Smirnov (KS) test simulation for the gamma distribution is 0.9376, so the null hypothesis is not rejected and the LHT time-to-acceptance data are compatible with a gamma distribution. Bootstrapping was used to add pointwise confidence intervals to the estimated gamma CDF (see Figure 4). The authors then applied the same fitting procedure to the data collected from the 18 journals (Table 3).
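The selection step can be replayed with the reported values. The sketch below picks the candidate with the smallest criterion and adds a back-of-envelope consistency check: assuming the conventional forms AIC = -2l + 2p and BIC = -2l + p log n, and assuming p = 2 for both fits (the two-parameter forms of each distribution), the gap BIC - AIC depends only on the sample size. The implied sample size is a derived figure under those assumptions, not a number reported in the study.

```python
import math

# AIC and BIC values as reported for the LHT fits.
reported = {
    "Weibull": {"aic": 1243.631, "bic": 1249.086},
    "gamma": {"aic": 1242.525, "bic": 1247.98},
}

# The preferred candidate has the minimum value of each criterion.
best_by_aic = min(reported, key=lambda name: reported[name]["aic"])
best_by_bic = min(reported, key=lambda name: reported[name]["bic"])

# Consistency check (assumption: p = 2 parameters for both fits).
P = 2
gaps = {name: v["bic"] - v["aic"] for name, v in reported.items()}
# BIC - AIC = p * (log n - 2), so n = exp(gap / p + 2).
implied_n = {name: math.exp(gap / P + 2.0) for name, gap in gaps.items()}

print(best_by_aic, best_by_bic)
print({name: round(n, 1) for name, n in implied_n.items()})
```

Both criteria select the gamma fit, and the two fits imply the same sample size, which is what one would expect when both candidates are fitted to the same data with the same number of parameters.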

TABLE 3. Descriptive Statistics of the Selected Journals for Further Study (t = days)
Data from two journals, Information Technology & People (ITP) and Library Resources & Technical Services (LRTS), did not follow any of the candidate distributions. Figure 5 and Figure 6 show two or more crests in the histograms of both ITP and LRTS, which explains why no distribution could be fitted to these two journals' data with the R package fitdistrplus. It is possible that some fixed factors affected the stability of these two journals' operations, such as technical upgrades, major changes in procedures or policies, and different review spans for several dominant topics. The time-to-acceptance data from the remaining 16 journals follow three main distributions (gamma, Weibull, and lognormal), ranging from 1 to 1500 days on the x-axis (Appendix A, Figures 8-19 and Figures 21-24). One more journal, LHT, was fitted in the exploratory study (Appendix A, Figure 20). To allow a better comparison of the 17 journals, Figure 7 shows all 17 distributions over a range of 500 days on the x-axis. Assuming the article eventually gets accepted, one can read from Figure 7 the probability that an article completes the entire peer review process within a given number of days, up to 500, after submission. Rejection rates and data on rejected manuscripts are not considered in this research. Relating Figure 7 to Table 3, one should also notice that the journals whose average and median time-to-acceptance are less than 200 days sit at the top left of Figure 7, the four journals with an average time-to-acceptance greater than 400 days sit at the bottom right, and the rest of the journals fall in the central part of Figure 7.
The average and median time-to-acceptance allow researchers and editors to compare the publication lags of these journals roughly. However, plots that give the cumulative percentage serve this purpose far better. For example, Figure 7 shows that all articles accepted by Library Hi Tech (LHT) completed the review process within one year, while submissions to the Journal of Strategic Information Systems (JSIS) have only a 30% probability of completing the review and revision process within the same time frame.
Combining the time-to-acceptance probability with the acceptance rate of submitted articles, journal editors can estimate the number of manuscripts ready for publishing. For instance, assuming the acceptance rate of Library Hi Tech (LHT) is 70% and it receives 15 manuscripts every month, the editor can estimate that approximately 10 of them will eventually be accepted. Considering the time-to-acceptance distribution of Library Hi Tech, the editor can also predict that within five months of the date of submission, about eight of these manuscripts (80%) are likely to have been accepted. By doing so, editors may foresee the manuscripts available for each upcoming issue and maintain a healthy inventory of submissions.
When researchers are selecting journals in which to publish their research articles, timeliness is one of the most important factors for them to consider. Other factors include scope, audience, acceptance rate, citation style, and impact factor. The distribution of time-to-acceptance may answer the question about the timeliness of a journal by estimating the timeline of the review process. For example, when an author finalizes a manuscript but has only five months (150 days) left before the deadline of some evaluation, it is in the author's best interest to identify the journal in the profession with the fastest turnaround time. By looking at the distributions (Figure 7), the author can see that, within 150 days, the probability of getting the article accepted by Library Hi Tech is 80%, by Journal of Academic Librarianship 70%, by Journal of the Association for Information Science and Technology 60%, and by Information & Management 10%. Thus, an author who selects Library Hi Tech as the target journal has the highest likelihood of completing the entire review process in time.
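The editor's arithmetic can be written out as follows. The 70% acceptance rate, the 15 monthly submissions, and the 80% share accepted within five months are the illustrative figures used in this paragraph; the function names are hypothetical.

```python
def expected_accepted(submissions, acceptance_rate):
    """Expected number of submissions that are eventually accepted."""
    return submissions * acceptance_rate

def expected_accepted_within(submissions, acceptance_rate, prob_within_window):
    """Expected number accepted within a time window.

    prob_within_window is the fitted probability that an eventually accepted
    article is accepted within the window (the CDF value at that time).
    """
    return expected_accepted(submissions, acceptance_rate) * prob_within_window

eventually = expected_accepted(15, 0.70)
within_five_months = expected_accepted_within(15, 0.70, 0.80)
print(eventually, within_five_months)
```

The unrounded values are 10.5 and 8.4, which correspond to the "approximately 10" and "eight" manuscripts mentioned in the text.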
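Reading such a probability off a fitted curve amounts to evaluating the fitted CDF at the deadline. The sketch below does this for a Weibull distribution, F(t) = 1 - exp(-(t / scale)^shape), and ranks journals by the result. The two journals and their parameters are hypothetical and are not the fitted values from this study.

```python
import math

def weibull_cdf(t, shape, scale):
    """Probability that time-to-acceptance is at most t days."""
    return 1.0 - math.exp(-((t / scale) ** shape))

# Hypothetical fitted parameters, invented for illustration.
fits = {
    "Journal A": {"shape": 1.5, "scale": 120.0},
    "Journal B": {"shape": 1.5, "scale": 400.0},
}

deadline_days = 150
prob_in_time = {
    name: weibull_cdf(deadline_days, f["shape"], f["scale"])
    for name, f in fits.items()
}
best = max(prob_in_time, key=prob_in_time.get)

for name, prob in sorted(prob_in_time.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {prob:.0%} chance of acceptance within {deadline_days} days")
print("Best choice for this deadline:", best)
```

These probabilities are conditional on the article eventually being accepted, since rejected manuscripts are outside the data used for fitting.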

CONCLUSION
Publication efficiency is a topic that has attracted the attention of scholars in various areas. Although researchers have not reached an agreement on the relationship between impact factor and the publication time lag of a journal, they agree that the time span between submission and acceptance is one of the key factors that affect a journal's reputation as well as authors' decisions in selecting journals for publication. Selecting a journal with a higher impact factor or quicker turnaround appears to be the best strategy for researchers to meet tenure review deadlines or a research panel with a fixed due date. Moreover, journal editors have also made efforts to facilitate the distribution of research achievements and improve their journals' prestige by optimizing review procedures and enhancing review quality.
Reviewing the literature on journal publication speed, the authors found one statistical model for predicting the time-to-acceptance of articles on the subject of "hurricane." However, as stated by its developers, the model is not applicable to an article set with a wider range of subject fields. The authors of this article therefore explored the possibility of establishing a mathematical model to estimate the probability that a journal's articles are accepted within a given time.
Examining the list of all 85 ISI-indexed journals in the profession of library and information science, the authors found that 24 of them make dates of submission and acceptance available online, but only 17 of them contain valid data for the study. Based on the available data retrieved from the journals' websites, the authors were able to develop estimation models of time-to-acceptance for the 17 journals, covering a range of research and practice areas in the profession of library and information science.
Regrettably, not many journals release data on their review process to the public. Releasing such data not only allows derivative studies, such as researching the relevancy of factors or modeling distributions of durations, but also helps researchers form a reasonable expectation of turnaround time for submissions. Therefore, the authors of this study call for refereed journals, hopefully in all disciplines, to make available the dates of submission, revision, acceptance, and publication in published articles.
In this study, the authors also noticed that some journals make revision dates available for some published articles. From the statistics of two of these journals and a simple calculation, it seems to the authors that the majority of published articles require revision before acceptance: for example, 95.58% of accepted articles required revision at LHT and 75.65% at JASIST; on average, articles that required revision took two more weeks to be accepted. The data also show that the more revisions required in the review process, the longer the average time it took to get the article accepted. Greifeneder (2013) has suggested that authors can follow the rules to help complete the review process faster by reducing the number of revisions.
Required revision in the review process is one of the factors that affect the time-to-acceptance. The distribution models in this study were built using an overarching duration that included the revision period. It would be important and interesting to conduct a future comparison study between time-to-acceptance without revision and time-to-revision-to-acceptance for accepted refereed articles. The comparison can only be made if the refereed journals release enough valuable data for research.
In addition, other factors that affect the time-to-acceptance include the journals' review procedures, policies, criteria, reviewers, editorial staffing, and topic. Although this study employed the most recent three years' data, these factors may already have changed during or after the data collection. Inconsistent procedures or unstable staffing, for example, can interfere with the probability distribution modeling used for estimation. Continued research and modeling based on updated data are needed to address these limitations.