Parsing using CSAW

Hi all (esp. @sweaglesw),
I need your help to resolve a parsing problem using CSAW (PCFG approximation of HPSG grammars, with semantics).

I am using CSAW to parse a corpus of 3000 sentences, one sentence per line, into MRS. Our bash script looks like this:

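n=1
# IFS= and -r keep leading whitespace and backslashes in each sentence intact
while IFS= read -r line; do
    # printf avoids echo's option handling; quoting prevents word splitting and globbing
    printf '%s\n' "$line" | ./csaw erg-1214-x86-64-0.9.25.dat all-treebanks-gp0-2018.pcfg -f > "f_mrs_${n}.txt"
    n=$((n+1))
done < "$filename"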

The script runs ./csaw on each line of the corpus, that is, on each sentence. However, some sentences cause “errors”, and the corresponding output files contain:

SKIP: Baseball, the only major sport without a time element, could soon be on the clock.

NOTE: 0 readings

NOTE: tsdb parse: (:total . 62) (:treal . 62) (:tcpu . 62) (:others . 4896524)
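
To see how many sentences are affected, the output files can be scanned for the SKIP marker. The following is only a minimal sketch: the function name list_skipped is hypothetical, and it assumes the f_mrs_<n>.txt naming from the script above and that a skipped sentence always appears on a line starting with "SKIP: ", as in the sample output.

```shell
# Hypothetical helper: list output files in which a sentence was skipped.
# Assumes files are named f_mrs_<n>.txt and that skipped sentences are
# reported on a line beginning with "SKIP: " (as in the sample above).
list_skipped() {
    dir=${1:-.}
    for f in "$dir"/f_mrs_*.txt; do
        # Skip the literal pattern when no file matches the glob.
        [ -e "$f" ] || continue
        # Print "<file><TAB><sentence>" for each SKIP line in this file.
        awk 'index($0, "SKIP: ") == 1 { print FILENAME "\t" substr($0, 7) }' "$f"
    done
}
```

Calling list_skipped with the output directory prints one line per skipped sentence, which can then be re-run by hand for comparison.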

However, if we then run the same sentence manually with the command below, we get the MRS output without errors.

echo "Baseball, the only major sport without a time element, could soon be on the clock." | ./csaw-0.9.25/csaw csaw-0.9.25/erg-1214-x86-64-0.9.25.dat csaw-0.9.25/all-treebanks-gp0-2018.pcfg -f > out.txt
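
Since the typed sentence and the corpus line look identical on screen, one thing that could be checked is whether the corpus line contains bytes that are not visible, such as a carriage return from Windows line endings or non-ASCII characters. Nothing here establishes that this is the cause; the sketch below is only a diagnostic, and the function name find_odd_lines is hypothetical.

```shell
# Hypothetical diagnostic: print the line numbers of a file that contain
# any byte outside printable ASCII (space through tilde). This flags
# carriage returns, tabs, and non-ASCII bytes alike.
find_odd_lines() {
    # LC_ALL=C makes awk treat the input as raw bytes, so the
    # bracket range " -~" covers exactly 0x20 through 0x7E.
    LC_ALL=C awk '/[^ -~]/ { print NR }' "$1"
}
```

Running find_odd_lines on the corpus file and comparing the reported line numbers with the sentences that were skipped would show whether the two sets overlap.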

Could you help me to resolve the problem?
Thanks in advance.

While I can’t explain the behavior you report (and hope someone else can), I notice that you are using a version of the grammar (ERG 1214) which is not fully consistent with the PCFG model you’re using, which was trained on the ERG 2018 version of the grammar. I would encourage you to update the grammar version to 2018 for better consistency.