Conversation
|
@marcaurele as long as you are removing the use of "ehCache", what about removing its dependency in our POM.xml files too? |
|
@marcaurele I only see the RMI config removed, can you instead provide the ehcache configuration to be used that can work out of the box by default? The GenericDaoBase initialises and still uses ehcache's Cache, CacheManager, Element etc. |
|
@rafaelweingartner @rhtyd Before doing too much work, in case people do not agree with my findings, I simply wanted to push this PR to open the discussion on the ehcache topic. I will post another one removing all the code that uses ehcache to clean it up if this one is going to be merged. |
|
@marcaurele the idea of ehcache was to reduce some load to the db server. Can you start a discussion thread on dev@ with details like what and how you'll replace ehcache. And your long term plan re:DB refactoring, the goals, proposed timeline etc. |
|
@rhtyd sure |
|
I have seen some issues with ehcache as well on mgmt servers, we disabled it often on deployment. |
|
@wido do you use some custom configuration, or do you simply keep the one provided by default? My next move will be to remove all the code related to ehcache in CS. Is that ok for you? |
|
@marcaurele We just disable it in the XML, we do not use any custom ones. |
|
@blueorangutan package |
|
@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
|
Packaging result: ✔centos6 ✔centos7 ✔debian. JID-2794 |
|
@blueorangutan test |
|
@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests |
|
Trillian test result (tid-3593)
|
|
code and tests are good. are we going on with this, @wido @rafaelweingartner @rhtyd @ustcweizhou ? /cc @marcaurele |
|
LGTM |
|
I'm in favour of merging this as ehcache currently does not work with CloudStack at all; all queries end up on the MySQL server. |
|
All in agreement, let's merge this! |
Automatically update _labels_ when upgrading between versions. Closes apache#2784 and apache#2913. See merge request scclouds/scclouds!1289
Description
This PR deactivates Ehcache in CloudStack since it is not usable. The first commit removes the default RMI cache peering configured for multicast, which most of the time cannot work. It also requires a network interface to be up, which is not always the case while developing offline.
The second commit removes the configuration that activates caching on some DAOs.
Problems
The code in CS does not seem to fit any caching mechanism, mainly because of the homemade DAO code. The three main flaws are the following:
Entities are not expected to be shared
A lot of code passes entity IDs as long values between methods, each of which fetches the object again. Without caching, each fetch of an entity with the same ID creates a distinct object. With the cache enabled, the same object is shared among those methods. This has been seen to cause side effects: some code still expects entity attributes to be unchanged after calling other methods, which leads to exceptions and bugs.
DAO update operations are using search queries
Some parts of the code update entities based on a search query, so the whole cache must be invalidated (see GenericDaoBase: public int update(UpdateBuilder ub, final SearchCriteria<?> sc, Integer rows);).
Entities based on views joining multiple tables
Many entities are based on SQL views that join multiple entities into a single object. Enabling caching on those would require a mechanism to link and cross-remove related objects whenever one of the sub-entities changes.
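The first flaw (entities not expected to be shared) can be illustrated with a minimal, self-contained sketch. It does not use CloudStack or Ehcache classes: VmEntity, UncachedDao, CachedDao and the helper methods are hypothetical names made up for this illustration, and the "database" is just an in-memory map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

public class SharedEntityDemo {

    /** Hypothetical stand-in for an entity (VO) class. */
    static final class VmEntity {
        final long id;
        String state;

        VmEntity(long id, String state) {
            this.id = id;
            this.state = state;
        }
    }

    /** Simulates a DAO without caching: every lookup builds a new object. */
    static final class UncachedDao {
        private final Map<Long, String> rows = new HashMap<>();

        void insert(long id, String state) {
            rows.put(id, state);
        }

        VmEntity findById(long id) {
            return new VmEntity(id, rows.get(id));
        }
    }

    /** Simulates a DAO with caching: lookups for the same ID return the same instance. */
    static final class CachedDao {
        private final UncachedDao delegate;
        private final Map<Long, VmEntity> cache = new HashMap<>();

        CachedDao(UncachedDao delegate) {
            this.delegate = delegate;
        }

        VmEntity findById(long id) {
            return cache.computeIfAbsent(id, key -> delegate.findById(key));
        }
    }

    /** A method that receives only the ID, re-fetches the entity and changes it in memory. */
    static void markStopping(LongFunction<VmEntity> finder, long id) {
        VmEntity vm = finder.apply(id);
        vm.state = "Stopping"; // in-memory change only, never written back
    }

    /** The caller fetches the entity, calls another method by ID, then reads its own copy. */
    static String stateSeenByCaller(LongFunction<VmEntity> finder, long id) {
        VmEntity mine = finder.apply(id);
        markStopping(finder, id);
        return mine.state;
    }

    public static void main(String[] args) {
        UncachedDao db = new UncachedDao();
        db.insert(1L, "Running");

        // Without caching the caller's object is untouched.
        System.out.println(stateSeenByCaller(db::findById, 1L)); // prints "Running"

        // With caching both methods share one instance, so the caller's view changes.
        CachedDao cached = new CachedDao(db);
        System.out.println(stateSeenByCaller(cached::findById, 1L)); // prints "Stopping"
    }
}
```

The caller's code is identical in both cases; only the lookup strategy differs. This is the kind of side effect that code written with distinct objects in mind can run into once a cache is switched on.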
Final word
Based on the points discussed above, the best approach IMHO would be to move away from the custom DAO framework in CS and use a well-known one (out of scope of this change, of course). It would handle caching well, as well as the joins made by the views in the code. It's not an easy change, but it would fix a lot of issues along the way and add a proven, robust framework to an important part of the code.
Types of changes
How Has This Been Tested?
A cluster of management servers with a manual ehcache RMI configuration has been set up to test different changes in the caching. The RMI cache setup has been verified through cache invalidation being propagated to the other server. A series of integration tests has been run while figuring out the reason for (random) errors.