My brother was working on a project recently and asked me to help him write a script that loops through all the PHP files in a directory, looks for a specific keyword within curly brackets, and then inserts comments before the first <tr> tag appearing before the keyword and after the first </tr> tag appearing after the keyword.
The following R code did the trick.
# Set working directory
setwd("D:\\furious")
# Obtain a list of files in the working directory ending in .php
selected.files <- list.files(pattern = "\\.php$", all.files = TRUE, full.names = TRUE,
                             recursive = TRUE)
modfiles <- "files modified"
notmodfiles <- "files not modified"
regexp <- "\\{desc\\}"
regexp2 <- "</tr>"
regexp4 <- "<tr>"
# Loop through selected.files list
for (file in selected.files) {
  theurl <- file
  webpage <- readLines(theurl)
  # Reset the flags for every file so an earlier match does not carry over
  filemod1 <- FALSE
  filemod2 <- FALSE
  # first line that contains {desc} (NA if the keyword is absent)
  startline <- which(regexpr(pattern = regexp, text = webpage) > 0)[1]
  if (!is.na(startline)) {
    # find the first </tr> at or after {desc}
    for (i in startline:length(webpage)) {
      if (regexpr(pattern = regexp2, text = webpage[i])[1] > 0) {
        tr.end.line <- i
        filemod1 <- TRUE
        break
      }
    }
    # find the first <tr> at or before {desc}, searching upward
    for (i in startline:1) {
      if (regexpr(pattern = regexp4, text = webpage[i])[1] > 0) {
        tr.start.line <- i
        filemod2 <- TRUE
        break
      }
    }
  }
  if (filemod1 && filemod2) {
    # put end description info after </tr>
    webpage[tr.end.line] <- paste(webpage[tr.end.line], "\n<!--end description-->")
    # put description info before <tr>
    webpage[tr.start.line] <- paste("<!--description-->\n", webpage[tr.start.line])
    # Set output directory so the original files will not be overwritten
    setwd("D:\\furious2")
    # Make sure the subfolder exists, since the file list is recursive
    dir.create(dirname(file), recursive = TRUE, showWarnings = FALSE)
    fileConn <- file(file)
    writeLines(webpage, fileConn)
    close(fileConn)
    # Return to the input directory before reading the next file
    setwd("D:\\furious")
    modfiles <- c(modfiles, file)
    print(paste("file modified:", theurl))
  } else {
    notmodfiles <- c(notmodfiles, file)
    print(paste("file NOT modified:", theurl))
  }
}
# Save the list of modified files
x = cbind(modfiles)
write.csv(x, file = "d:/ModifiedFiles.csv")
The first line after setting the working directory gets the list of files whose names match the regular expression given in the pattern argument, as shown:
setwd("D:\\furious")
selected.files <- list.files(pattern="\\.php$", all.files=TRUE, full.names=TRUE, recursive=TRUE)
head(selected.files)
## [1] "./sd_layout_1-_burgundy.php" "./sd_layout_1-_citrus.php"
## [3] "./sd_layout_1-_forest.php" "./sd_layout_1-_gold.php"
## [5] "./sd_layout_1-_marine.php" "./sd_layout_1-_midnight.php"
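To get a feel for what the pattern argument accepts, the same regular expression can be tried on a few file names with grepl(). The names below are made up for illustration and are not files from the project.

```r
# Hypothetical file names, used only to illustrate the pattern
candidates <- c("index.php", "index.php.bak", "notes_php", "layout.PHP")

# "\\." matches a literal dot and "$" anchors the match to the end of the name
is.php <- grepl("\\.php$", candidates)

print(candidates[is.php])
```

Only names that end in a lowercase .php pass the test. If upper-case extensions should be picked up as well, list.files() has an ignore.case argument that can be set to TRUE.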
Then the code searches every line of each document for the keyword specified in the variable regexp and records the line number where it appears. It then searches downward from that line until it finds the expression given by the variable regexp2 and writes <!--end description--> after it. After that, it searches upward from the same line for the first occurrence of the expression given by the variable regexp4 and writes <!--description--> before it.
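The search described above can also be sketched on a small in-memory example. This is a simplified variant and not the script itself: the helper name wrap.row and the sample page are hypothetical, and it uses grep() and append() to add the markers as separate lines instead of pasting "\n" into an existing line.

```r
# Hypothetical helper: wrap the table row that contains the keyword
wrap.row <- function(lines, keyword = "\\{desc\\}") {
  key <- grep(keyword, lines)[1]                 # first line containing the keyword
  if (is.na(key)) return(lines)                  # keyword absent: leave unchanged
  # first </tr> at or after the keyword
  end <- grep("</tr>", lines[key:length(lines)], fixed = TRUE)[1] + key - 1
  # last <tr> at or before the keyword (NA if there is none)
  start <- rev(grep("<tr>", lines[1:key], fixed = TRUE))[1]
  if (is.na(end) || is.na(start)) return(lines)  # row is incomplete: leave unchanged
  # add the closing marker first so the index held in start stays valid
  lines <- append(lines, "<!--end description-->", after = end)
  append(lines, "<!--description-->", after = start - 1)
}

# Made-up page content for illustration
page <- c("<table>", "<tr>", "<td>{desc}</td>", "</tr>", "</table>")
result <- wrap.row(page)
print(result)
```

Like the script, this sketch only recognises a bare <tr> tag, so a row opened with attributes (for example a class) would not be found.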
If both changes were made, the updated file is written to the furious2 directory and the file is recorded in d:/ModifiedFiles.csv. This process repeats for each matching file in the working directory set at the beginning of the code.
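The output step could also be written without switching the working directory, by building the target path with file.path(). The sketch below is only an illustration under assumptions: it uses folders inside tempdir() as stand-ins for the furious2 directory and the CSV file, and the two file names are invented.

```r
# Hypothetical stand-ins for the output directory and the log file
out.dir <- file.path(tempdir(), "furious2")
log.file <- file.path(tempdir(), "ModifiedFiles.csv")

# Pretend these two relative paths were modified by the loop
modified <- c("a.php", "sub/b.php")

for (f in modified) {
  target <- file.path(out.dir, f)
  # create the output folder (and any subfolder) before writing
  dir.create(dirname(target), recursive = TRUE, showWarnings = FALSE)
  writeLines("<!--description-->", target)
}

# record the modified files once, after the loop has finished
write.csv(data.frame(modfiles = modified), file = log.file, row.names = FALSE)
```

Writing the log once after the loop avoids rewriting the CSV file for every processed file.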