After in-depth analysis and testing across many combinations of environments, we found that the hardest hurdle in writing the check_retriever() function for the rdataretriever package is not the difference between operating systems (Linux, Windows, macOS). It is actually the existence of different virtual environments on the system.
When there is more than one virtual environment on the system, a whole new spectrum of possibilities comes into the picture:
- The user might have Retriever installed in any one of the environments
- R might be running in one environment while the Retriever package is installed in another.
- Or both might be in the same environment.
- The user might also be running R in the normal system environment while the package lives in a virtual environment.
- On top of all these possibilities, the user could be running R code through the R terminal or through RStudio. This is one of the uncertainties that bugs R users the most, because the way R finds and interprets paths on the system differs between the terminal and the IDE. The terminal uses the system variables, while RStudio uses its own variables, which can be entirely different.
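One way to see the terminal/IDE difference is to print what the current R session actually has on its PATH and compare the output from both. The snippet below is only a diagnostic sketch: path_report() is a hypothetical helper name, not part of rdataretriever, and the check on the RSTUDIO environment variable assumes RStudio sets it to "1" for its sessions.

```r
# Hypothetical diagnostic helper (not part of rdataretriever):
# summarises how the current R session would look up the retriever executable.
path_report <- function(exe = "retriever") {
  # Split PATH into its individual directories using the platform separator
  # (":" on Linux/macOS, ";" on Windows).
  entries <- strsplit(Sys.getenv("PATH"), .Platform$path.sep, fixed = TRUE)[[1]]
  list(
    in_rstudio   = identical(Sys.getenv("RSTUDIO"), "1"),  # assumption: set by RStudio
    path_entries = entries,
    exe_location = unname(Sys.which(exe))  # "" when nothing is found
  )
}

report <- path_report()
print(report$in_rstudio)
print(report$path_entries)
print(report$exe_location)
```

Running this once from the R terminal and once from RStudio on the same machine shows whether the two sessions see the same directories.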
Approaches that did not work
The very first approach was to use system terminal commands such as which on Linux-based systems and where on Windows, but this turned out to be an inadequate way to solve the issue, as it did not really work on macOS. Also, running from the terminal and from the IDE produced different paths.
The next solution was to leverage the functions of Reticulate. It has a very useful function, py_config(), which tracks the Python versions on the system and identifies their paths. The approach was to take these paths and append them to the PATH variable, as returned by Sys.getenv('PATH'). After this, R can try to find Retriever in these locations using Sys.which(). Assuming that py_config() is able to track down all the Python versions, this approach seemed bulletproof, because if Retriever is not found in any of the appended paths, then it surely does not exist on the system.
The problem was the assumption that py_config() is able to find all the Python versions on the system. In reality, it did not, and the culprit is the existence of multiple virtual environments. The function cannot tell whether Retriever is present in any environment other than the current one.
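The lookup step described above can be sketched roughly as follows. This is a minimal illustration, not the code that was actually tried: probe_for_executable() and its candidate_dirs argument are hypothetical names, and the directories are assumed to come from whatever discovery step is used (for example, the folders of the Python binaries reported by Reticulate). The sketch restores PATH afterwards so that the probe leaves the session unchanged.

```r
# Hypothetical helper: temporarily extend PATH with candidate directories
# and ask R where the executable is.
probe_for_executable <- function(candidate_dirs, exe = "retriever") {
  old_path <- Sys.getenv("PATH")
  # Put the original PATH back when the function exits, even on error.
  on.exit(Sys.setenv(PATH = old_path), add = TRUE)

  # Append the candidates after the existing entries.
  extended <- paste(c(old_path, candidate_dirs), collapse = .Platform$path.sep)
  Sys.setenv(PATH = extended)

  # Sys.which() returns the full path, or "" when nothing is found.
  unname(Sys.which(exe))
}

# Example call with a made-up directory; returns "" unless retriever is found.
probe_for_executable(file.path(tempdir(), "some-env", "bin"))
```

An empty result only means that Retriever is absent from the directories that were supplied, which is exactly why the approach depends on the discovery step being complete.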
Approach Implemented
After all the failures encountered while trying to build a foolproof function that can track down Retriever no matter the combination of OS, virtual environment, Python installation, and IDE or terminal, the conclusion was to give the user the flexibility of supplying the required path to the rdataretriever package directly. This is done by a new function, use_RetrieverPath(), which takes the path and appends it to the PATH variable. This works from both the terminal and the IDE, and across operating systems. Most importantly, even if the user is not working in the same environment as Retriever, R can still access it from a different environment as long as its path is known.
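The idea can be illustrated with a small sketch. This is not the actual source of use_RetrieverPath(); add_dir_to_path() is a hypothetical stand-in that shows one way a user-supplied directory could be appended to PATH for the current R session, with a check that the directory exists and is not already listed.

```r
# Hypothetical stand-in for the idea behind use_RetrieverPath():
# append a user-supplied directory to PATH for the current R session.
add_dir_to_path <- function(path) {
  stopifnot(is.character(path), length(path) == 1, dir.exists(path))

  sep <- .Platform$path.sep
  entries <- strsplit(Sys.getenv("PATH"), sep, fixed = TRUE)[[1]]

  # Only append when the directory is not already listed.
  if (!(path %in% entries)) {
    Sys.setenv(PATH = paste(c(entries, path), collapse = sep))
  }
  invisible(Sys.getenv("PATH"))
}

# Usage (the directory below is a placeholder for wherever Retriever lives):
# add_dir_to_path("/path/to/virtualenv/bin")
# Sys.which("retriever")
```

Because the change is made with Sys.setenv(), it affects only the running R session, so the call has to be repeated in each new session.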