Tuesday, September 2, 2008

Finished RbYAML on GSoC

After about four months of work, I have almost reached my rbyaml goals for GSoC 2008.
Many thanks to my mentors, Xueyong Zhi and Ola Bini. They helped me a lot with technical matters and with requirements from the community. It's my first open source project, and it has been a fantastic experience.
Now rbyaml supports more of the YAML 1.1 standard and is compatible with more of the YAML 1.0 standard than before. The default rbyaml interpreter is for YAML 1.1. You can also use the YAML 1.0 interpreter with require 'rbyaml_1.0'.
But the goal of full compatibility with Syck is still in progress.

Although GSoC is finished, I will keep working on rbyaml, improving and maintaining it, and I also hope to contribute to more open source projects. I think that's the most important goal of GSoC: getting more and more developers to contribute to open source projects, isn't it?

The future direction of rbyaml should probably be as follows.
1. Syck compatibility.
2. Support for the newer YAML standards, including 1.1 and 1.2 (still in draft).
3. Working well on Rubinius, Ruby 1.9, XRuby, etc.

Sunday, July 6, 2008

catch up with jvyamlb test, working on syck test now

The GSoC mid-term evaluation is coming.
Last month I worked on catching up with jvyamlb, and now almost all of the current jvyamlb tests pass on the rbyaml parser. I think it is a small milestone for rbyaml.
Thanks to Ola Bini; the jvyamlb codebase really helped me a lot while fixing rbyaml defects.

Over the last few days I have started working on Syck compatibility. Generally, the issues are unimplemented methods and small defects when loading or dumping unusual input streams.
I've already added some missing methods such as parse and transform.

RbYAML still needs polishing. If you find any defects in rbyaml, please report them at http://code.google.com/p/rbyaml/issues/list. Thanks very much.

Friday, May 30, 2008

load document start marker & timestamp

Still working on adding load_spec tests and fixing the bugs found along the way.
1. RbYAML should load "---" as a string, not as a document start marker. A document start marker is "---" followed by whitespace: "---\n", "---\t", "--- " (with a space after the dashes), etc.
RbYAML.load("---") => "---"


2. When loading a timestamp, the fraction of a second should be parsed correctly.
Previously, "2001-12-15T02:59:43.1Z" was loaded as a Time instance whose microsecond value was 1000.
In my opinion it should be 1e5 microseconds (1 second equals 1e6 microseconds).
Also, if the fraction part has more than 6 digits, it should be rounded. For example,
for "2001-12-15T02:59:43.1234567890000Z" the microsecond value should be 123457.
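
The two loading rules above can be sketched in plain Ruby. This is only an illustration, not RbYAML's actual code: document_start? and fraction_to_usec are hypothetical helper names, and the marker check covers only the three whitespace characters listed above.

```ruby
# Hypothetical helpers, not part of RbYAML.

# True when the text begins with a document start marker, i.e. "---"
# followed by a space, tab or newline. A bare "---" is not a marker.
def document_start?(text)
  !!(text =~ /\A---[ \t\n]/)
end

# Convert the digits after the decimal point of a timestamp into
# microseconds, rounding when there are more than 6 digits.
# Rational keeps the arithmetic exact.
def fraction_to_usec(digits)
  (Rational(digits.to_i, 10**digits.length) * 1_000_000).round
end

document_start?("---")             # false, so it loads as the string "---"
document_start?("--- ")            # true
fraction_to_usec("1")              # 100000
fraction_to_usec("1234567890000")  # 123457
```

Note that a fraction such as "9999999" rounds up to 1000000, so a real parser would have to carry the extra second.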

The following is about CRuby.
1. Syck (CRuby's YAML) loads the fraction as 1234567890000 microseconds and then carries it over into seconds, minutes, hours...
Here the result is Sat Dec 15 03:35:30 UTC 2001, with a microsecond value of 483647.
2. The Time class. For a Time instance, to_s and inspect are only accurate to the second. You have to call usec to get the microseconds. This can be confusing when you compare two Time instances.
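
Both CRuby observations can be checked with plain Time arithmetic. The sketch below does not call Syck, and its method names are made up for the illustration. One thing I noticed: 2147 seconds plus 483647 microseconds is exactly 2**31 - 1 microseconds, so the result above would be consistent with the fraction being capped at the largest signed 32-bit integer. That is only an inference from the numbers, not something I confirmed in the Syck source.

```ruby
# Plain Ruby Time arithmetic; Syck itself is not involved.
# All method names here are hypothetical.

def base_time
  Time.utc(2001, 12, 15, 2, 59, 43)
end

# Assumption: the fraction ends up as 2**31 - 1 microseconds.
# Rational avoids floating point error in the addition.
def shifted_time
  base_time + Rational(2**31 - 1, 1_000_000)
end

# Two instants that differ only in their microseconds.
def tenth_of_second
  Time.utc(2001, 12, 15, 2, 59, 43, 100_000)
end

def thousandth_of_second
  Time.utc(2001, 12, 15, 2, 59, 43, 1_000)
end

shifted_time.usec                                  # 483647
tenth_of_second.to_s == thousandth_of_second.to_s  # true: to_s stops at seconds
tenth_of_second == thousandth_of_second            # false: usec differs
```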

Saturday, May 24, 2008

changes to RbYAML#load

These days I have been adding more spec test cases, most of them from jvyamlb.

What has changed?
1. RbYAML can load an empty instance, that is, one that doesn't have any value. Previously, rbyaml threw an exception when loading an instance without a value.
For example,
RbYAML.load("!!str") => ""
RbYAML.load("!!null") => nil
#nil is an instance of NilClass

2. RbYAML returns nil when loading an empty stream
RbYAML.load("") => nil


Another thing is that I am going to support both YAML 1.0 and YAML 1.1, because there are differences between them.
You can use "require 'rbyaml_1.0'" to get a YAML 1.0 parser.
But note that YAML 1.0 parsing is still in progress.

Tuesday, May 20, 2008

conflict between rbyaml and syck

(This conflict will not happen if you use rbyaml instead of syck in your ruby compiler/interpreter.)

Most of the time you don't want to load the yaml lib and the rbyaml lib at the same time,
but sometimes the yaml lib is required implicitly while loading other common libraries,
because lots of libraries load yaml in their implementations.
Here is an example.

What will reproduce the problem?
require 'rubygems' # or any file that requires yaml implicitly
require 'rbyaml'
:symbol.to_yaml # call to_yaml method of any base type object

What is expected?
The to_yaml defined in the RbYAML library is called.
What do you see instead?
The to_yaml defined in the Syck library is called.

Most of the time the methods from Syck and RbYAML have the same behaviour and output.
But if there is a defect in Syck, it will affect rbyaml's behaviour.
In this example, RbYAML only defines to_yaml in the Object class, so when you call Symbol#to_yaml it falls through to the superclass method, Object#to_yaml.
But Syck defines to_yaml in Object, Symbol and many other base types. So if you load rbyaml after Syck, rbyaml overwrites to_yaml in Object but leaves the to_yaml methods in other base types such as Symbol, Hash, etc.
After that, if you call Symbol#to_yaml, it does not call the superclass method defined by RbYAML but the Symbol#to_yaml defined by Syck. That's not what we want.
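
The lookup behaviour described above can be reproduced without either library. In this sketch demo_to_yaml is a made-up method name standing in for to_yaml, the "first library" plays the role of Syck and the "second library" the role of RbYAML. remove_leftover_definition shows just one possible remedy; it is not necessarily how the fix in rbyaml was done.

```ruby
# Stand-in for Syck: defines the method on Object and on Symbol.
class Object
  def demo_to_yaml
    "first library (Object)"
  end
end

class Symbol
  def demo_to_yaml
    "first library (Symbol)"
  end
end

# Stand-in for RbYAML, loaded afterwards: it reopens Object only.
class Object
  def demo_to_yaml
    "second library (Object)"
  end
end

# One possible remedy (hypothetical): drop the more specific
# definition so that method lookup reaches Object again.
def remove_leftover_definition
  Symbol.send(:remove_method, :demo_to_yaml)
end

Object.new.demo_to_yaml  # "second library (Object)"
:symbol.demo_to_yaml     # "first library (Symbol)", the old one still wins
```

Calling remove_leftover_definition and then :symbol.demo_to_yaml gives "second library (Object)".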

The to_yaml conflict has already been fixed, but I think some other conflicts may still exist. I've added an issue about this on code.google.

Monday, May 19, 2008

porting the spec tests

Today I ported all the YAML spec tests from rubinius to rbyaml.
To make all of them pass with the rbyaml library, I fixed three small bugs.
1. RbYAML.tagged_classes, given the int tag URI, should return Integer.
RbYAML.tagged_classes["tag:yaml.org,2002:int"].should == Integer

2. Symbol#to_yaml should return the symbol's YAML representation.
:symbol.to_yaml.should == "--- :symbol\n"

3. RbYAML should have a tagurize method.
RbYAML.tagurize('wtf').should == "tag:yaml.org,2002:wtf"
RbYAML.tagurize(1).should == 1
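
The two tagurize expectations suggest a simple rule: a string gets the yaml.org tag prefix, and anything else is returned unchanged. Here is a minimal sketch of that rule under a made-up name, sketch_tagurize. It is written only from the two expectations above and is not taken from RbYAML's source, so the real method may handle more cases.

```ruby
# Hypothetical stand-in for RbYAML.tagurize, derived only from the
# two spec expectations; not the library's implementation.
def sketch_tagurize(value)
  # Non-string values pass through untouched.
  return value unless value.is_a?(String)

  "tag:yaml.org,2002:#{value}"
end

sketch_tagurize('wtf')  # "tag:yaml.org,2002:wtf"
sketch_tagurize(1)      # 1
```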

Rubinius and Syck really helped me a lot while improving rbyaml.
Thanks to them.