Java – Should I use Drools in this case?
I will use a university library system to explain my use case. Students register in the library system and provide their profile: gender, age, department, previously completed courses, currently registered courses, books already borrowed, etc. Each book in the library system defines borrowing rules based on the student's profile. For example, a computer algorithms textbook can only be borrowed by students currently registered in the corresponding course; another textbook can only be borrowed by students in the mathematics department; there may also be rules such as "a student can borrow at most 2 computer networking books". Because of the borrowing rules, when a student searches or browses the library system, he should only see the books he can borrow. So the use case comes down to efficiently generating the list of books a student is eligible to borrow.
Here is how I would design this with Drools: each book has a rule whose LHS is a few field constraints on the student's profile. The RHS of a book rule simply adds the book ID to a global result list. All book rules are loaded into a RuleBase. When a student searches or browses the library system, a stateless session is created from the RuleBase and the student's profile is asserted as a fact. Every book the student can borrow then fires its rule, and the global result list ends up holding the complete list of books the student can borrow.
A few assumptions: the library will handle millions of books; I don't expect the book rules to be complex, each having at most 3 simple field constraints; the system needs to handle about 100k students, so the load is quite heavy. My questions are: how much memory will Drools consume if a million book rules are loaded? How fast will it be to fire all of those million rules? If Drools is the right fit, I would like to hear best practices from experienced users for designing such a system. Thank you.
Solution
First, don't make a rule for every book. Make rules for the restrictions: there are far fewer restrictions than books. This will have a huge impact on runtime and memory usage.
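To illustrate the "rules per restriction, not per book" idea outside of Drools, here is a minimal plain-Java sketch. Every name in it (Student, Book, Restrictions, the restriction keys) is hypothetical and not taken from the question; it only shows that a million books can share a handful of restriction definitions.

```java
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical student profile, reduced to the fields the sketch needs.
record Student(String department, Set<String> enrolledCourses) {}

// A book does not carry its own rule; it only names a shared restriction.
record Book(String id, String restriction) {}

final class Restrictions {
    private Restrictions() {}

    // A small, fixed set of restrictions shared by all books (assumed keys).
    static final Map<String, Predicate<Student>> RULES = Map.of(
        "NONE", s -> true,
        "MATH_DEPT_ONLY", s -> s.department().equals("Mathematics"),
        "ENROLLED_IN_ALGORITHMS", s -> s.enrolledCourses().contains("Algorithms"));

    // Unknown restriction keys deny borrowing; that default is an assumption.
    static boolean canBorrow(Student s, Book b) {
        return RULES.getOrDefault(b.restriction(), x -> false).test(s);
    }
}
```

The number of rule definitions now grows with the number of distinct restrictions, while each book only stores a key.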
Running a large number of books through a rule engine will get expensive, especially since you won't show all the results to the user: only 10-50 per page. One idea that came to mind was to use the rule engine to build a set of query criteria (I wouldn't actually do this, see below).
Here is what I had in mind:
rule "Only two books for networking"
when
    Student($checkedOutBooks : checkedOutBooks)
    // Two different networking books are already checked out
    Book(subjects contains "networking", $book1 : id) from $checkedOutBooks
    Book(subjects contains "networking", id != $book1) from $checkedOutBooks
then
    criteria.add("subject is not 'networking'", PRIORITY.LOW);
end

rule "Books allowed for course"
when
    $course : Course($textbooks : textbooks)
    Student(enrolledCourses contains $course)
    Book($book : id) from $textbooks
then
    criteria.add("book_id = " + $book, PRIORITY.HIGH);
end
But I wouldn't actually do that!
Here is how I would change the problem: not showing books to users is a bad experience. A user may want to browse the books to see which ones to get next time. Show the books, but do not allow checkout of the restricted ones. That way you only need to run 1-50 books per user through the rules at a time, which will be quite zippy. The rules above become:
rule "Allowed for course"
activation-group "Only one rule is fired"
salience 10000
when
    // This book is about to be displayed on the page, hence inserted into working memory
    $book : Book()
    $course : Course(textbooks contains $book)
    Student(enrolledCourses contains $course)
then
    // Do nothing, allow the book
end

rule "Only two books for networking"
activation-group "Only one rule is fired"
salience 100
when
    Student($checkedOutBooks : checkedOutBooks)
    Book(subjects contains "networking", $book1 : id) from $checkedOutBooks
    Book(subjects contains "networking", id != $book1) from $checkedOutBooks
    // This book is about to be displayed on the page, hence inserted into working memory
    $book : Book(subjects contains "networking")
then
    disallowedForCheckout.put($book, "Cannot have more than two networking books");
end
I'm using an activation group to ensure that only one rule fires, and salience to ensure that the rules are considered in the order I want.
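The "first matching rule wins, in priority order" behaviour can also be mimicked in plain Java, which may help when reasoning about what the two rules should produce for one page of results. This is only a sketch under assumptions: the Borrower and ShelfBook types and the CheckoutPolicy class are hypothetical, and the Drools engine is not involved.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical book about to be displayed on the current page.
record ShelfBook(String id, Set<String> subjects) {}

// Hypothetical student view: textbook ids of enrolled courses plus current loans.
record Borrower(Set<String> courseTextbookIds, List<ShelfBook> checkedOut) {}

final class CheckoutPolicy {
    private CheckoutPolicy() {}

    // Returns book id -> reason for every book on the page that may not be checked out.
    static Map<String, String> disallowed(Borrower s, List<ShelfBook> page) {
        Map<String, String> result = new LinkedHashMap<>();
        long networkingOut = s.checkedOut().stream()
            .filter(b -> b.subjects().contains("networking"))
            .count();
        for (ShelfBook b : page) {
            // Highest priority: a course textbook is always allowed, so no
            // lower-priority check is evaluated for this book.
            if (s.courseTextbookIds().contains(b.id())) {
                continue;
            }
            // Lower priority: block networking books once two are already out.
            if (b.subjects().contains("networking") && networkingOut >= 2) {
                result.put(b.id(), "Cannot have more than two networking books");
            }
        }
        return result;
    }
}
```

Because only the books on the current page are passed in, the cost per request stays proportional to the page size rather than to the size of the catalogue.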
Finally, keep the rules cached. Drools allows, and recommends, that you load the rules into a knowledge base only once and then create sessions from it. Knowledge bases are expensive; sessions are cheap.
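The build-once, create-many pattern can be sketched in plain Java as follows. CompiledRules and Session are hypothetical stand-ins, not Drools classes; in recent Drools versions the corresponding roles are played by KieBase and KieSession, but the sketch deliberately avoids the real API.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a compiled rule base: expensive to construct.
final class CompiledRules {
    // Counts constructions so the caching effect can be observed.
    static final AtomicInteger BUILDS = new AtomicInteger();

    CompiledRules() {
        BUILDS.incrementAndGet(); // parsing and compiling would happen here
    }

    // Creating a session is cheap: it only references the shared rules.
    Session newSession() {
        return new Session(this);
    }
}

// Hypothetical per-request session.
final class Session {
    final CompiledRules rules;

    Session(CompiledRules rules) {
        this.rules = rules;
    }
}

final class RuleCache {
    private RuleCache() {}

    // Initialization-on-demand holder: the rules are built lazily, exactly
    // once, and safely under concurrent access.
    private static final class Holder {
        static final CompiledRules INSTANCE = new CompiledRules();
    }

    static Session newSession() {
        return Holder.INSTANCE.newSession();
    }
}
```

Each request would call RuleCache.newSession(), use the session, and discard it, while the compiled rules live for the lifetime of the application.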